03. Convolutional Neural Networks and Computer Vision with TensorFlow¶

We've covered some TensorFlow basics and built a few models. Now let's get specific and learn about a special kind of neural network called a convolutional neural network (CNN), which helps detect patterns in visual data.

Note: Many different model architectures can be used for different deep learning problems. You can use CNNs on image data and even text data, though some architectures work better on certain problems than others. So don't rely on one specific method.

For example:

  • Classify whether a picture is a pizza 🍕 or steak 🥩 (we'll be doing this)
  • Detect whether an object appears in an image (did a car just pass by the dashcam?)

What's covered:¶

  • Getting a dataset to work with
  • Architecture of a CNN (Convolutional Neural Network)
  • A quick end-to-end example (what we're working towards)
  • Steps in modelling for binary image classification in CNN
    • Becoming one with the data
    • Preparing data for modelling
    • Creating a CNN model (starting with a baseline)
    • Fitting a model (getting it to find patterns in our data)
    • Evaluating a model
    • Improving a model
    • Making a prediction with a trained model
  • Steps in modelling for multi-class image classification with a CNN
    • Same as above (but with a different dataset)
In [2]:
import datetime
print(f"Notebook last run (end-to-end): {datetime.datetime.now()}")
Notebook last run (end-to-end): 2025-05-21 10:08:00.984057

Get the data¶

CNNs work very well with image data, and we're going to use an image dataset to learn about them.

Our image dataset comes from Food-101, a collection of 101,000 real-world images of food dishes across 101 different categories.

To begin, we'll create a binary classifier for pizza 🍕 and steak 🥩

A zip file containing only the pizza and steak classes has already been prepared, so no pre-processing is needed for now.

In [4]:
import zipfile
import urllib.request

# Step 1: Download the zip file
url = "https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip"
urllib.request.urlretrieve(url, "pizza_steak.zip")  # saves the file locally

# Step 2: Unzip the file
with zipfile.ZipFile("pizza_steak.zip", "r") as zip_ref:
    zip_ref.extractall()  # extract all files to the current directory
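Re-running the cell above downloads the whole zip again every time. As an optional tweak (a sketch, not part of the original notebook), you could guard the download so it's skipped when the file already exists locally:

```python
import os
import urllib.request

def download_once(url, filename):
    """Download a file only if it doesn't already exist locally."""
    if os.path.exists(filename):
        print(f"{filename} already exists, skipping download")
        return filename
    urllib.request.urlretrieve(url, filename)  # saves the file locally
    return filename
```

Calling `download_once(url, "pizza_steak.zip")` in place of `urllib.request.urlretrieve` would make the cell safe to re-run without re-downloading.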

Inspect the data (become one with it)¶

This is a crucial step. It usually means visualizing and scanning through the folders of data you're working with, trying to understand what it is you're dealing with.

The file structure follows a format you'll typically see when working with image datasets.

Example of what it looks like:

pizza_steak <- top level folder
└───train <- training images
│   └───pizza
│   │   │   1008104.jpg
│   │   │   1638227.jpg
│   │   │   ...      
│   └───steak
│       │   1000205.jpg
│       │   1647351.jpg
│       │   ...
│   
└───test <- testing images
│   └───pizza
│   │   │   1001116.jpg
│   │   │   1507019.jpg
│   │   │   ...      
│   └───steak
│       │   100274.jpg
│       │   1653815.jpg
│       │   ...

Let's inspect the directory, which can be done with os.listdir (short for "list directory"). First, we need to import os.

In [16]:
import os
os.listdir('pizza_steak')
Out[16]:
['test', 'train']
In [17]:
os.listdir('pizza_steak/train/')
Out[17]:
['steak', 'pizza']
In [18]:
files = os.listdir('pizza_steak/train/steak/')
cols = 10

# Print files in a grid
for i in range(0, len(files), cols):
    print("  ".join(files[i:i+cols]))
239025.jpg  1155665.jpg  3007772.jpg  1598345.jpg  658189.jpg  172936.jpg  3807440.jpg  168775.jpg  331860.jpg  2939678.jpg
2173084.jpg  1327667.jpg  468384.jpg  3074367.jpg  1487113.jpg  2568848.jpg  143490.jpg  2233395.jpg  3009617.jpg  2995169.jpg
... (output truncated: 750 filenames in total)
In [19]:
files = os.listdir('pizza_steak/train/pizza/')
cols=10

for i in range(0,len(files),cols):
    print(" ".join(files[i:i+cols]))
2577377.jpg 102037.jpg 384215.jpg 1033251.jpg 2312987.jpg 3057192.jpg 2501961.jpg 132484.jpg 1888911.jpg 3426946.jpg
2572958.jpg 2700543.jpg 143453.jpg 166823.jpg 670201.jpg 2667255.jpg 1763205.jpg 387697.jpg 375401.jpg 89892.jpg
... (output truncated: 750 filenames in total)

There are a lot of images, but how many exactly?

In [20]:
# walk through pizza_steak directory and list number of files
for dirpath, dirnames, filenames in os.walk('pizza_steak'):
    print(f"There are {len(dirnames)} directories and {len(filenames)} images in `{dirpath}`")
There are 2 directories and 0 images in `pizza_steak`
There are 2 directories and 0 images in `pizza_steak\test`
There are 0 directories and 250 images in `pizza_steak\test\steak`
There are 0 directories and 250 images in `pizza_steak\test\pizza`
There are 2 directories and 0 images in `pizza_steak\train`
There are 0 directories and 750 images in `pizza_steak\train\steak`
There are 0 directories and 750 images in `pizza_steak\train\pizza`
In [21]:
# another way to find number of images in a folder
num_steak_images_train = len(os.listdir('pizza_steak/train/steak/'))

num_steak_images_train
Out[21]:
750
In [22]:
# get list of class names (very useful if dealing with many categories and classes)
import pathlib
import numpy as np
data_dir = pathlib.Path('pizza_steak/train/') # turn our training path into python path
                                              # it turns our folder string, into a path object. A flexible way to work with file paths in python

class_names = np.array(sorted([item.name for item in data_dir.glob('*')])) # created a list of class_names from the subdirectories
                                                                           # data_dir.glob > gets all items in the directory
                                                                           # item.name for item > extracts only the name of each item
                                                                           # sorted(...) > sorts name by alphabetical order
                                                                           # np.array(...) > converts sorted list to Numpy array for efficient processing
class_names
Out[22]:
array(['pizza', 'steak'], dtype='<U5')
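The class_names array comes in handy later for turning prediction probabilities back into labels. A quick sketch of the idea (the 0.83 is a hypothetical sigmoid output, not a real prediction from our model):

```python
import numpy as np

class_names = np.array(['pizza', 'steak'])

# hypothetical sigmoid output from a binary classifier (closer to 1 = 'steak')
pred_prob = 0.83

# round the probability to a class index (0 or 1), then look up its name
pred_class = class_names[int(np.round(pred_prob))]
print(pred_class)  # steak
```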

Based on the info above, we've got 750 training images and 250 test images per class (pizza and steak).

Now let's visualize one of the images

In [23]:
# view an image
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import random

def view_random_image(target_dir, target_class):
    # setup target directory
    target_folder = target_dir+target_class

    # get a random image path
    random_image = random.sample(os.listdir(target_folder), 1) # random.sample returns a list; the 1 means pick just one image

    # read in the image and plot it using matplotlib
    img = mpimg.imread(target_folder + '/' + random_image[0]) # random_image is a list, so index [0] to get the single filename string
    plt.imshow(img)
    plt.title(target_class)
    plt.axis('off');

    print(f"Image shape: {img.shape}")

    return img
In [25]:
# view a random image from the training dataset
img = view_random_image(target_dir='pizza_steak/train/',
                        target_class='steak')
Image shape: (512, 512, 3)

You might notice we've printed the image shape along with the actual image. This is because the computer sees the image as a big array of numbers (a tensor).

In [37]:
# how the computer views img 
img
Out[37]:
array([[[226, 156,  44],
        [220, 150,  38],
        [207, 137,  23],
        ...,
        [219, 170, 173],
        [212, 163, 166],
        [220, 171, 174]],

       [[223, 155,  46],
        [219, 151,  40],
        [212, 144,  33],
        ...,
        [209, 160, 163],
        [208, 159, 162],
        [212, 163, 166]],

       [[211, 146,  42],
        [209, 145,  39],
        [211, 147,  41],
        ...,
        [207, 159, 159],
        [209, 161, 161],
        [210, 162, 162]],

       ...,

       [[224, 192, 171],
        [227, 195, 174],
        [228, 196, 175],
        ...,
        [233, 207, 190],
        [236, 211, 191],
        [242, 217, 197]],

       [[222, 192, 168],
        [226, 196, 172],
        [228, 197, 176],
        ...,
        [234, 208, 191],
        [238, 213, 193],
        [244, 219, 199]],

       [[225, 195, 171],
        [229, 199, 175],
        [229, 198, 177],
        ...,
        [230, 204, 187],
        [232, 207, 187],
        [234, 209, 189]]], dtype=uint8)
In [38]:
img.shape # image shape, returned as (height, width, colour channels)
Out[38]:
(512, 512, 3)

Our image is a 3D array of (height, width, colour channels). The height and width of images in our dataset may vary, but we'll always have 3 colour channels, representing red, green and blue.

The img array contains values between 0 and 255, the possible range for each of the red, green and blue channels (grayscale images use the same range).

So when building a model to differentiate between pizza and steak, it will find patterns in these pixel values to determine which class is which.

Note: Although the values are between 0 and 255, neural networks tend to work much better with values between 0 and 1, so we'll need to scale (normalize) the image by dividing it by 255.

In [39]:
# get all values within 0 to 1
img/255.0
Out[39]:
array([[[0.88627451, 0.61176471, 0.17254902],
        [0.8627451 , 0.58823529, 0.14901961],
        [0.81176471, 0.5372549 , 0.09019608],
        ...,
        [0.85882353, 0.66666667, 0.67843137],
        [0.83137255, 0.63921569, 0.65098039],
        [0.8627451 , 0.67058824, 0.68235294]],

       [[0.8745098 , 0.60784314, 0.18039216],
        [0.85882353, 0.59215686, 0.15686275],
        [0.83137255, 0.56470588, 0.12941176],
        ...,
        [0.81960784, 0.62745098, 0.63921569],
        [0.81568627, 0.62352941, 0.63529412],
        [0.83137255, 0.63921569, 0.65098039]],

       [[0.82745098, 0.57254902, 0.16470588],
        [0.81960784, 0.56862745, 0.15294118],
        [0.82745098, 0.57647059, 0.16078431],
        ...,
        [0.81176471, 0.62352941, 0.62352941],
        [0.81960784, 0.63137255, 0.63137255],
        [0.82352941, 0.63529412, 0.63529412]],

       ...,

       [[0.87843137, 0.75294118, 0.67058824],
        [0.89019608, 0.76470588, 0.68235294],
        [0.89411765, 0.76862745, 0.68627451],
        ...,
        [0.91372549, 0.81176471, 0.74509804],
        [0.9254902 , 0.82745098, 0.74901961],
        [0.94901961, 0.85098039, 0.77254902]],

       [[0.87058824, 0.75294118, 0.65882353],
        [0.88627451, 0.76862745, 0.6745098 ],
        [0.89411765, 0.77254902, 0.69019608],
        ...,
        [0.91764706, 0.81568627, 0.74901961],
        [0.93333333, 0.83529412, 0.75686275],
        [0.95686275, 0.85882353, 0.78039216]],

       [[0.88235294, 0.76470588, 0.67058824],
        [0.89803922, 0.78039216, 0.68627451],
        [0.89803922, 0.77647059, 0.69411765],
        ...,
        [0.90196078, 0.8       , 0.73333333],
        [0.90980392, 0.81176471, 0.73333333],
        [0.91764706, 0.81960784, 0.74117647]]])

A (typical) architecture of a convolutional neural network¶

CNNs are no different from other deep learning neural networks, besides the fact that they can be created in many different ways. Here's a list of the components you'll typically find in a CNN:

| Hyperparameter/layer type | What it does | Typical values |
| --- | --- | --- |
| Input image(s) | Target images you want to discover patterns in | Any type of photo (or video) |
| Input layer 💙 | Takes target images and preprocesses them for further layers | `input_shape = [batch_size, img_height, img_width, channels]` |
| Convolutional layer 💧 | Extracts/learns the most important features in an image | Multiple, can create with `tf.keras.layers.ConvXD` (X can be multiple values) |
| Hidden activation 💚 | Adds non-linearity to learned features (non-straight lines) | Usually ReLU (`tf.keras.activations.relu`) |
| Pooling layer 💛 | Reduces the dimensionality of learned image features | Average (`tf.keras.layers.AvgPool2D`) or Max (`tf.keras.layers.MaxPool2D`) |
| Fully connected layer 🧡 | Further refines learned features from convolution layers | `tf.keras.layers.Dense` |
| Output layer 🧡 | Takes learned features and outputs them in the shape of the target labels | `output_shape = [number_of_classes]` (e.g. 3 for pizza, steak or sushi) |
| Output activation 🖤 | Adds non-linearity to the output layer | `tf.keras.activations.sigmoid` (binary classification) or `tf.keras.activations.softmax` (multi-class) |
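To get a concrete feel for what convolutional and pooling layers do to an image's spatial dimensions, here's a small sketch (not from the original notebook) of the standard output-size formula, output = floor((input - kernel + 2*padding) / stride) + 1:

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Spatial output size of a conv/pool layer along one dimension."""
    return (input_size - kernel_size + 2 * padding) // stride + 1

# a 224x224 image through a 3x3 conv ("valid" padding, stride 1)...
after_conv = conv_output_size(224, kernel_size=3)   # 222
# ...then a 2x2 max pool with stride 2 roughly halves each dimension
after_pool = conv_output_size(after_conv, kernel_size=2, stride=2)  # 111
print(after_conv, after_pool)
```

This is why stacking conv + pooling layers progressively shrinks the feature maps while (hopefully) concentrating the useful information.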

A typical CNN model, colour coded to its respective layer type


How they stack together:


A simple example of how you might stack together the above layers into a convolutional neural network. Note the convolutional and pooling layers can often be arranged and rearranged into many different formations
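To make the stacking concrete, here's a minimal sketch of how those layers might be arranged in Keras (a simplified TinyVGG-style stack for binary classification; the layer sizes are illustrative, not the exact model we'll build):

```python
import tensorflow as tf

# A TinyVGG-style stack: conv + ReLU blocks, pooling, then a dense classifier
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),        # input layer
    tf.keras.layers.Conv2D(10, 3, activation="relu"),  # convolution + hidden activation
    tf.keras.layers.Conv2D(10, 3, activation="relu"),
    tf.keras.layers.MaxPool2D(),                       # pooling layer
    tf.keras.layers.Conv2D(10, 3, activation="relu"),
    tf.keras.layers.Conv2D(10, 3, activation="relu"),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),                         # flatten feature maps for the dense layers
    tf.keras.layers.Dense(1, activation="sigmoid"),    # output layer + output activation (binary)
])
```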

An end-to-end example¶

We've seen that there are 750 training and 250 testing images per class. It's time to jump in at the deep end.

In the original Food-101 paper, the dataset authors reported that a Random Forest machine learning model averaged an accuracy of 50.76% when predicting different foods.

That 50.76% will be the baseline we aim to beat.

Note: A baseline is a score/evaluation metric you want to beat. Usually you start with a simple model, create a baseline with it, and then beat it by increasing the complexity of the model. A fun way to get a baseline is from a modelling paper with published results.

The code below replicates an end-to-end model for the pizza_steak dataset, including a CNN built from the components explained above. We'll go through each step later in the notebook.

The model replicates TinyVGG, the computer vision architecture that powers the CNN Explainer webpage.

Resource: The architecture we're using is a scaled-down version of VGG-16.

In [1]:
!pip install tensorflow
Collecting tensorflow
  Downloading tensorflow-2.19.0-cp310-cp310-win_amd64.whl.metadata (4.1 kB)
Collecting absl-py>=1.0.0 (from tensorflow)
  Using cached absl_py-2.2.2-py3-none-any.whl.metadata (2.6 kB)
Collecting astunparse>=1.6.0 (from tensorflow)
  Using cached astunparse-1.6.3-py2.py3-none-any.whl.metadata (4.4 kB)
Collecting flatbuffers>=24.3.25 (from tensorflow)
  Using cached flatbuffers-25.2.10-py2.py3-none-any.whl.metadata (875 bytes)
Collecting gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1 (from tensorflow)
  Using cached gast-0.6.0-py3-none-any.whl.metadata (1.3 kB)
Collecting google-pasta>=0.1.1 (from tensorflow)
  Using cached google_pasta-0.2.0-py3-none-any.whl.metadata (814 bytes)
Collecting libclang>=13.0.0 (from tensorflow)
  Using cached libclang-18.1.1-py2.py3-none-win_amd64.whl.metadata (5.3 kB)
Collecting opt-einsum>=2.3.2 (from tensorflow)
  Using cached opt_einsum-3.4.0-py3-none-any.whl.metadata (6.3 kB)
Requirement already satisfied: packaging in x:\anaconda3\envs\tf_gpu\lib\site-packages (from tensorflow) (25.0)
Collecting protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<6.0.0dev,>=3.20.3 (from tensorflow)
  Downloading protobuf-5.29.4-cp310-abi3-win_amd64.whl.metadata (592 bytes)
Collecting requests<3,>=2.21.0 (from tensorflow)
  Using cached requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Requirement already satisfied: setuptools in x:\anaconda3\envs\tf_gpu\lib\site-packages (from tensorflow) (78.1.1)
Requirement already satisfied: six>=1.12.0 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from tensorflow) (1.17.0)
Collecting termcolor>=1.1.0 (from tensorflow)
  Using cached termcolor-3.1.0-py3-none-any.whl.metadata (6.4 kB)
Requirement already satisfied: typing-extensions>=3.6.6 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from tensorflow) (4.13.2)
Collecting wrapt>=1.11.0 (from tensorflow)
  Downloading wrapt-1.17.2-cp310-cp310-win_amd64.whl.metadata (6.5 kB)
Collecting grpcio<2.0,>=1.24.3 (from tensorflow)
  Downloading grpcio-1.71.0-cp310-cp310-win_amd64.whl.metadata (4.0 kB)
Collecting tensorboard~=2.19.0 (from tensorflow)
  Using cached tensorboard-2.19.0-py3-none-any.whl.metadata (1.8 kB)
Collecting keras>=3.5.0 (from tensorflow)
  Using cached keras-3.10.0-py3-none-any.whl.metadata (6.0 kB)
Collecting numpy<2.2.0,>=1.26.0 (from tensorflow)
  Downloading numpy-2.1.3-cp310-cp310-win_amd64.whl.metadata (60 kB)
Collecting h5py>=3.11.0 (from tensorflow)
  Downloading h5py-3.13.0-cp310-cp310-win_amd64.whl.metadata (2.5 kB)
Collecting ml-dtypes<1.0.0,>=0.5.1 (from tensorflow)
  Downloading ml_dtypes-0.5.1-cp310-cp310-win_amd64.whl.metadata (22 kB)
Collecting tensorflow-io-gcs-filesystem>=0.23.1 (from tensorflow)
  Downloading tensorflow_io_gcs_filesystem-0.31.0-cp310-cp310-win_amd64.whl.metadata (14 kB)
Collecting charset-normalizer<4,>=2 (from requests<3,>=2.21.0->tensorflow)
  Downloading charset_normalizer-3.4.2-cp310-cp310-win_amd64.whl.metadata (36 kB)
Collecting idna<4,>=2.5 (from requests<3,>=2.21.0->tensorflow)
  Using cached idna-3.10-py3-none-any.whl.metadata (10 kB)
Collecting urllib3<3,>=1.21.1 (from requests<3,>=2.21.0->tensorflow)
  Downloading urllib3-2.4.0-py3-none-any.whl.metadata (6.5 kB)
Collecting certifi>=2017.4.17 (from requests<3,>=2.21.0->tensorflow)
  Downloading certifi-2025.4.26-py3-none-any.whl.metadata (2.5 kB)
Collecting markdown>=2.6.8 (from tensorboard~=2.19.0->tensorflow)
  Downloading markdown-3.8-py3-none-any.whl.metadata (5.1 kB)
Collecting tensorboard-data-server<0.8.0,>=0.7.0 (from tensorboard~=2.19.0->tensorflow)
  Using cached tensorboard_data_server-0.7.2-py3-none-any.whl.metadata (1.1 kB)
Collecting werkzeug>=1.0.1 (from tensorboard~=2.19.0->tensorflow)
  Using cached werkzeug-3.1.3-py3-none-any.whl.metadata (3.7 kB)
Requirement already satisfied: wheel<1.0,>=0.23.0 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from astunparse>=1.6.0->tensorflow) (0.45.1)
Collecting rich (from keras>=3.5.0->tensorflow)
  Downloading rich-14.0.0-py3-none-any.whl.metadata (18 kB)
Collecting namex (from keras>=3.5.0->tensorflow)
  Using cached namex-0.0.9-py3-none-any.whl.metadata (322 bytes)
Collecting optree (from keras>=3.5.0->tensorflow)
  Downloading optree-0.15.0-cp310-cp310-win_amd64.whl.metadata (49 kB)
Collecting MarkupSafe>=2.1.1 (from werkzeug>=1.0.1->tensorboard~=2.19.0->tensorflow)
  Downloading MarkupSafe-3.0.2-cp310-cp310-win_amd64.whl.metadata (4.1 kB)
Collecting markdown-it-py>=2.2.0 (from rich->keras>=3.5.0->tensorflow)
  Using cached markdown_it_py-3.0.0-py3-none-any.whl.metadata (6.9 kB)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from rich->keras>=3.5.0->tensorflow) (2.19.1)
Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich->keras>=3.5.0->tensorflow)
  Using cached mdurl-0.1.2-py3-none-any.whl.metadata (1.6 kB)
Downloading tensorflow-2.19.0-cp310-cp310-win_amd64.whl (375.7 MB)
   --------------------------------------- 375.7/375.7 MB 10.0 MB/s eta 0:00:00
Downloading grpcio-1.71.0-cp310-cp310-win_amd64.whl (4.3 MB)
   ---------------------------------------- 0.0/4.3 MB ? eta -:--:--
   -------------------------- ------------- 2.9/4.3 MB 14.0 MB/s eta 0:00:01
   ---------------------------------------- 4.3/4.3 MB 12.9 MB/s eta 0:00:00
Downloading ml_dtypes-0.5.1-cp310-cp310-win_amd64.whl (209 kB)
Downloading numpy-2.1.3-cp310-cp310-win_amd64.whl (12.9 MB)
   ---------------------------------------- 0.0/12.9 MB ? eta -:--:--
   ------- -------------------------------- 2.4/12.9 MB 12.2 MB/s eta 0:00:01
   --------------- ------------------------ 5.0/12.9 MB 12.6 MB/s eta 0:00:01
   ------------------------ --------------- 7.9/12.9 MB 12.8 MB/s eta 0:00:01
   -------------------------------- ------- 10.5/12.9 MB 13.1 MB/s eta 0:00:01
   ---------------------------------------- 12.9/12.9 MB 12.4 MB/s eta 0:00:00
Downloading protobuf-5.29.4-cp310-abi3-win_amd64.whl (434 kB)
Using cached requests-2.32.3-py3-none-any.whl (64 kB)
Downloading charset_normalizer-3.4.2-cp310-cp310-win_amd64.whl (105 kB)
Using cached idna-3.10-py3-none-any.whl (70 kB)
Using cached tensorboard-2.19.0-py3-none-any.whl (5.5 MB)
Using cached tensorboard_data_server-0.7.2-py3-none-any.whl (2.4 kB)
Downloading urllib3-2.4.0-py3-none-any.whl (128 kB)
Using cached absl_py-2.2.2-py3-none-any.whl (135 kB)
Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Downloading certifi-2025.4.26-py3-none-any.whl (159 kB)
Using cached flatbuffers-25.2.10-py2.py3-none-any.whl (30 kB)
Using cached gast-0.6.0-py3-none-any.whl (21 kB)
Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
Downloading h5py-3.13.0-cp310-cp310-win_amd64.whl (3.0 MB)
   ---------------------------------------- 0.0/3.0 MB ? eta -:--:--
   ---------------------------------------  2.9/3.0 MB 12.9 MB/s eta 0:00:01
   ---------------------------------------- 3.0/3.0 MB 12.3 MB/s eta 0:00:00
Using cached keras-3.10.0-py3-none-any.whl (1.4 MB)
Using cached libclang-18.1.1-py2.py3-none-win_amd64.whl (26.4 MB)
Downloading markdown-3.8-py3-none-any.whl (106 kB)
Using cached opt_einsum-3.4.0-py3-none-any.whl (71 kB)
Downloading tensorflow_io_gcs_filesystem-0.31.0-cp310-cp310-win_amd64.whl (1.5 MB)
   ---------------------------------------- 0.0/1.5 MB ? eta -:--:--
   ---------------------------------------- 1.5/1.5 MB 11.2 MB/s eta 0:00:00
Using cached termcolor-3.1.0-py3-none-any.whl (7.7 kB)
Using cached werkzeug-3.1.3-py3-none-any.whl (224 kB)
Downloading MarkupSafe-3.0.2-cp310-cp310-win_amd64.whl (15 kB)
Downloading wrapt-1.17.2-cp310-cp310-win_amd64.whl (38 kB)
Using cached namex-0.0.9-py3-none-any.whl (5.8 kB)
Downloading optree-0.15.0-cp310-cp310-win_amd64.whl (297 kB)
Downloading rich-14.0.0-py3-none-any.whl (243 kB)
Using cached markdown_it_py-3.0.0-py3-none-any.whl (87 kB)
Using cached mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Installing collected packages: namex, libclang, flatbuffers, wrapt, urllib3, termcolor, tensorflow-io-gcs-filesystem, tensorboard-data-server, protobuf, optree, opt-einsum, numpy, mdurl, MarkupSafe, markdown, idna, grpcio, google-pasta, gast, charset-normalizer, certifi, astunparse, absl-py, werkzeug, requests, ml-dtypes, markdown-it-py, h5py, tensorboard, rich, keras, tensorflow

   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   -------------------------------------- - 31/32 [tensorflow]
   ---------------------------------------- 32/32 [tensorflow]

Successfully installed MarkupSafe-3.0.2 absl-py-2.2.2 astunparse-1.6.3 certifi-2025.4.26 charset-normalizer-3.4.2 flatbuffers-25.2.10 gast-0.6.0 google-pasta-0.2.0 grpcio-1.71.0 h5py-3.13.0 idna-3.10 keras-3.10.0 libclang-18.1.1 markdown-3.8 markdown-it-py-3.0.0 mdurl-0.1.2 ml-dtypes-0.5.1 namex-0.0.9 numpy-2.1.3 opt-einsum-3.4.0 optree-0.15.0 protobuf-5.29.4 requests-2.32.3 rich-14.0.0 tensorboard-2.19.0 tensorboard-data-server-0.7.2 tensorflow-2.19.0 tensorflow-io-gcs-filesystem-0.31.0 termcolor-3.1.0 urllib3-2.4.0 werkzeug-3.1.3 wrapt-1.17.2
In [3]:
!pip install Pillow
Collecting Pillow
  Downloading pillow-11.2.1-cp310-cp310-win_amd64.whl.metadata (9.1 kB)
Downloading pillow-11.2.1-cp310-cp310-win_amd64.whl (2.7 MB)
   ---------------------------------------- 2.7/2.7 MB 11.9 MB/s eta 0:00:00
Installing collected packages: Pillow
Successfully installed Pillow-11.2.1
In [1]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Set the seed
tf.random.set_seed(42)

# Preprocess data (get all of the pixel values between 0 and 1, also called scaling/normalization)
train_datagen = ImageDataGenerator(rescale=1./255)
valid_datagen = ImageDataGenerator(rescale=1./255)

# Setup the train and test directories
train_dir = "pizza_steak/train/"
test_dir = "pizza_steak/test/"

# Import data from directories and turn it into batches
train_data = train_datagen.flow_from_directory(train_dir,
                                               batch_size=32, # number of images to process at a time 
                                               target_size=(224, 224), # convert all images to be 224 x 224
                                               class_mode="binary", # type of problem we're working on
                                               seed=42)

valid_data = valid_datagen.flow_from_directory(test_dir,
                                               batch_size=32,
                                               target_size=(224, 224),
                                               class_mode="binary",
                                               seed=42)

# Create a CNN model (same as Tiny VGG - https://poloclub.github.io/cnn-explainer/)
model_1 = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(filters=10, 
                         kernel_size=3, # can also be (3, 3)
                         activation="relu", 
                         input_shape=(224, 224, 3)), # first layer specifies input shape (height, width, colour channels)
  tf.keras.layers.Conv2D(10, 3, activation="relu"),
  tf.keras.layers.MaxPool2D(pool_size=2, # pool_size can also be (2, 2)
                            padding="valid"), # padding can also be 'same'
  tf.keras.layers.Conv2D(10, 3, activation="relu"),
  tf.keras.layers.Conv2D(10, 3, activation="relu"), # activation='relu' == tf.keras.layers.Activations(tf.nn.relu)
  tf.keras.layers.MaxPool2D(2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(1, activation="sigmoid") # binary activation output
])

# Compile the model
model_1.compile(loss="binary_crossentropy",
              optimizer=tf.keras.optimizers.Adam(),
              metrics=["accuracy"])

# Fit the model
history_1 = model_1.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=valid_data,
                        validation_steps=len(valid_data))
Found 1500 images belonging to 2 classes.
Found 500 images belonging to 2 classes.
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 27s 551ms/step - accuracy: 0.6127 - loss: 0.6567 - val_accuracy: 0.8200 - val_loss: 0.4221
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 182ms/step - accuracy: 0.7963 - loss: 0.4422 - val_accuracy: 0.8440 - val_loss: 0.3758
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 182ms/step - accuracy: 0.7963 - loss: 0.4386 - val_accuracy: 0.8660 - val_loss: 0.3287
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 184ms/step - accuracy: 0.8481 - loss: 0.3722 - val_accuracy: 0.8580 - val_loss: 0.3748
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 185ms/step - accuracy: 0.8452 - loss: 0.3756 - val_accuracy: 0.8700 - val_loss: 0.3321
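
Where does the `47/47` in the progress bar come from? `len(train_data)` is the number of batches, i.e. the 1,500 training images split into batches of 32, with the last partial batch still counting as a step. A quick sanity check in plain Python (no TensorFlow needed):

```python
import math

# Values reported by flow_from_directory above
num_train_images = 1500  # "Found 1500 images belonging to 2 classes."
batch_size = 32

# One "step" processes one batch; the final partial batch still counts as a step
steps_per_epoch = math.ceil(num_train_images / batch_size)
print(steps_per_epoch)  # 47 -> matches the "47/47" shown for each epoch
```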

Nice, we've achieved 87% accuracy on the validation set, way above the 50.78% initial goal! Bear in mind this is a binary classification model (pizza vs. steak) rather than the full 101 categories, but at least it shows the model is able to learn.

Since we've already fit a model, let's check out its architecture.

In [2]:
# check out the layers in our model
model_1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 222, 222, 10)   │           280 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 220, 220, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 110, 110, 10)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_2 (Conv2D)               │ (None, 108, 108, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_3 (Conv2D)               │ (None, 106, 106, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D)  │ (None, 53, 53, 10)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 28090)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 1)              │        28,091 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 93,305 (364.48 KB)
 Trainable params: 31,101 (121.49 KB)
 Non-trainable params: 0 (0.00 B)
 Optimizer params: 62,204 (242.99 KB)

Many of the layers in model_1 can be understood through the CNN Explainer website, which covers convolutional layers, what they do, their activation functions, pooling layers, and flatten layers.
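
Every number in the summary above can be reproduced by hand. A minimal sketch of the shape and parameter arithmetic for "valid"-padded convolutions (the helper names here are made up for illustration):

```python
def conv_output_size(in_size, kernel_size=3, stride=1):
    """Spatial output size of a 'valid'-padded convolution (no padding)."""
    return (in_size - kernel_size) // stride + 1

def conv_params(in_channels, filters, kernel_size=3):
    """Each filter has kernel_size*kernel_size*in_channels weights plus 1 bias."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

print(conv_output_size(224))  # 222   -> conv2d output (None, 222, 222, 10)
print(conv_params(3, 10))     # 280   -> conv2d Param #
print(conv_params(10, 10))    # 910   -> conv2d_1 Param #
print(53 * 53 * 10)           # 28090 -> flatten output after the last MaxPool2D
print(28090 * 1 + 1)          # 28091 -> dense Param # (28090 weights + 1 bias)
```

Each MaxPool2D with `pool_size=2` simply halves the spatial dimensions (220 → 110, 106 → 53), which is why the feature maps shrink as you go deeper.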

However there are a few others that haven't been discussed, namely:

  • The ImageDataGenerator class and the rescale parameter
  • The flow_from_directory() method
    • The batch_size parameter
    • The target_size parameter
  • Conv2D layers (and the parameters which come with them)
  • MaxPool2D layers (and their parameters)
  • The steps_per_epoch and validation_steps parameters in the fit() function

Before digging into these, let's try fitting a model we've worked with previously to our data.

Using the same model as before¶

To see how neural networks can adapt to different problems, let's revisit the binary classification model we built previously and see how it performs on our image data.

We can use all the same parameters except two things:

  • The data - we'll be working with images instead of dots
  • The input shape - we need to tell the neural network the shape of the images it will be working with.
    • Common practice is resizing all images to one size. For us, we'll resize to (224, 224, 3), aka an image width/height of 224 pixels, with 3 colour channels (RGB).
In [4]:
# set random seed
tf.random.set_seed(42)

# create a model to replicate the Tensorflow Playground model
model_2 = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(224,224,3)),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# compile the model
model_2.compile(loss='binary_crossentropy',
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy'])

# fit the model
history_2 = model_2.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=valid_data,
                        validation_steps=len(valid_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\reshaping\flatten.py:37: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 21s 423ms/step - accuracy: 0.4962 - loss: 0.7066 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 5s 96ms/step - accuracy: 0.5518 - loss: 0.6847 - val_accuracy: 0.5000 - val_loss: 0.6932
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 4s 94ms/step - accuracy: 0.5259 - loss: 0.6743 - val_accuracy: 0.6380 - val_loss: 0.6128
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 4s 96ms/step - accuracy: 0.6336 - loss: 0.6272 - val_accuracy: 0.5000 - val_loss: 0.7052
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 4s 93ms/step - accuracy: 0.4978 - loss: 0.7000 - val_accuracy: 0.5000 - val_loss: 0.6931

Hmm, it doesn't look like the model has learned anything, reaching only ~50% accuracy on the training and validation sets. In binary classification, that's no better than guessing.

In [5]:
# check the second model's architecture
model_2.summary()
Model: "sequential_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ flatten_2 (Flatten)             │ (None, 150528)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense)                 │ (None, 4)              │       602,116 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_5 (Dense)                 │ (None, 4)              │            20 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_6 (Dense)                 │ (None, 1)              │             5 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,806,425 (6.89 MB)
 Trainable params: 602,141 (2.30 MB)
 Non-trainable params: 0 (0.00 B)
 Optimizer params: 1,204,284 (4.59 MB)

It's clear that model_2 has a much larger number of parameters compared to model_1.

model_2 has 600k+ trainable parameters, while model_1 has only ~31k, yet model_1 outperforms it.

Note: trainable parameters are the patterns a model can learn from the data. Intuitively, you might think more is better, and that's sometimes the case. But here the difference comes down to the style of model. A series of Dense layers connects every parameter to every input value (which drives the parameter count up), whereas a CNN reuses small filters across the image to learn the most important patterns with far fewer parameters.
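
The contrast is easy to verify from the two summaries: a Dense layer must hold one weight per input value per unit, while a Conv2D layer only holds one small filter per output channel, reused across the whole image. A quick back-of-the-envelope check:

```python
# model_2's first Dense layer sees every flattened pixel at once:
flattened_input = 224 * 224 * 3           # 150528 values per image
dense_params = (flattened_input + 1) * 4  # weights + 1 bias for each of 4 units
print(dense_params)  # 602116 -> matches dense_4 in model_2.summary()

# model_1's first Conv2D layer reuses one 3x3 filter per output channel:
conv_params = (3 * 3 * 3 + 1) * 10        # 10 filters over 3 input channels
print(conv_params)   # 280 -> matches conv2d in model_1.summary()
```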

Since model_2 didn't work, in what way can we make it work?

Should we increase the number of layers? Or maybe the number of neurons per layer?

More specifically, we'll increase the number of neurons (also called hidden units) in each dense layer from 4 to 100, and add an extra layer.

Note: Adding extra layers/increasing the number of neurons is known as increasing the complexity of your model.

In [6]:
# set random seed
tf.random.set_seed(42)

# create a model similar to model_1 but add an extra layer and increase the number of hidden units in each layer
model_3 = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(224,224,3)), # dense layers expect a 1-dimensional vector as input
    tf.keras.layers.Dense(100, activation='relu'), # increase the number of neurons from 4 to 100 (for every layer)
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Compile the model
model_3.compile(loss='binary_crossentropy',
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy'])

# Fit the model
history_3 = model_3.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=valid_data,
                        validation_steps=len(valid_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\reshaping\flatten.py:37: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 10s 188ms/step - accuracy: 0.6247 - loss: 6.6889 - val_accuracy: 0.6380 - val_loss: 1.3323
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 187ms/step - accuracy: 0.6997 - loss: 0.9636 - val_accuracy: 0.6620 - val_loss: 1.2410
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 8s 177ms/step - accuracy: 0.7238 - loss: 0.9987 - val_accuracy: 0.6800 - val_loss: 1.0702
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 8s 172ms/step - accuracy: 0.7212 - loss: 0.7914 - val_accuracy: 0.7520 - val_loss: 0.5467
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 8s 174ms/step - accuracy: 0.7101 - loss: 0.8614 - val_accuracy: 0.7580 - val_loss: 0.4529

The model is definitely learning something now, reaching 71% training accuracy and 75% accuracy on the validation dataset!

Let's check out its architecture again.

In [8]:
# check out model_3 architecture
model_3.summary()
Model: "sequential_3"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ flatten_3 (Flatten)             │ (None, 150528)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_7 (Dense)                 │ (None, 100)            │    15,052,900 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_8 (Dense)                 │ (None, 100)            │        10,100 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_9 (Dense)                 │ (None, 1)              │           101 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 45,189,305 (172.38 MB)
 Trainable params: 15,063,101 (57.46 MB)
 Non-trainable params: 0 (0.00 B)
 Optimizer params: 30,126,204 (114.92 MB)

Despite model_3's 15 million trainable parameters, model_1 still outperforms it with its measly ~31k. That goes to show the power of CNNs and their ability to learn patterns with far fewer parameters.

Binary classification: Let's break it down¶

1. Become one with the data¶

Whatever type of data you're working with, you want to visualize at least 10-100 samples to start building your own mental model of the data.

In our case, we may notice the steak images tend to have darker colours and often include side dishes, whereas pizza tends to have a distinct circular shape. These are the kinds of patterns our neural network may pick up as well.

You may also notice whether some of your data is messed up (e.g. wrongly labelled) and consider ways to fix it.
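
One simple way to become one with the data is to pull out random images by class and look at them. A minimal sketch, assuming the `pizza_steak` folder layout from the download step (the function name `pick_random_image` is ours, not a library API):

```python
import os
import random

def pick_random_image(target_dir, target_class):
    """Return the path of a randomly chosen file from target_dir/target_class."""
    class_dir = os.path.join(target_dir, target_class)
    filenames = os.listdir(class_dir)
    return os.path.join(class_dir, random.choice(filenames))

# Example usage (assumes the pizza_steak folder downloaded earlier):
# img_path = pick_random_image("pizza_steak/train", "steak")
```

Re-running this a handful of times per class, and plotting each image (e.g. with matplotlib's `imshow`), is usually enough to spot obvious labelling mistakes.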

In [11]:
!pip install matplotlib
Collecting matplotlib
  Downloading matplotlib-3.10.3-cp310-cp310-win_amd64.whl.metadata (11 kB)
Collecting contourpy>=1.0.1 (from matplotlib)
  Downloading contourpy-1.3.2-cp310-cp310-win_amd64.whl.metadata (5.5 kB)
Collecting cycler>=0.10 (from matplotlib)
  Using cached cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib)
  Downloading fonttools-4.58.0-cp310-cp310-win_amd64.whl.metadata (106 kB)
Collecting kiwisolver>=1.3.1 (from matplotlib)
  Downloading kiwisolver-1.4.8-cp310-cp310-win_amd64.whl.metadata (6.3 kB)
Requirement already satisfied: numpy>=1.23 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from matplotlib) (2.1.3)
Requirement already satisfied: packaging>=20.0 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from matplotlib) (25.0)
Requirement already satisfied: pillow>=8 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from matplotlib) (11.2.1)
Collecting pyparsing>=2.3.1 (from matplotlib)
  Downloading pyparsing-3.2.3-py3-none-any.whl.metadata (5.0 kB)
Requirement already satisfied: python-dateutil>=2.7 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from matplotlib) (2.9.0.post0)
Requirement already satisfied: six>=1.5 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from python-dateutil>=2.7->matplotlib) (1.17.0)
Downloading matplotlib-3.10.3-cp310-cp310-win_amd64.whl (8.1 MB)
Downloading contourpy-1.3.2-cp310-cp310-win_amd64.whl (221 kB)
Using cached cycler-0.12.1-py3-none-any.whl (8.3 kB)
Downloading fonttools-4.58.0-cp310-cp310-win_amd64.whl (2.2 MB)
Downloading kiwisolver-1.4.8-cp310-cp310-win_amd64.whl (71 kB)
Downloading pyparsing-3.2.3-py3-none-any.whl (111 kB)
Installing collected packages: pyparsing, kiwisolver, fonttools, cycler, contourpy, matplotlib


Successfully installed contourpy-1.3.2 cycler-0.12.1 fonttools-4.58.0 kiwisolver-1.4.8 matplotlib-3.10.3 pyparsing-3.2.3
In [31]:
# visualize data (requires function 'view_random_image' above)
import matplotlib.pyplot as plt

plt.figure()
plt.subplot(1,2,1)
steak_img = view_random_image('pizza_steak/train/','steak')
plt.subplot(1,2,2)
pizza_img = view_random_image('pizza_steak/train/','pizza')
Image shape: (512, 382, 3)
Image shape: (384, 512, 3)
No description has been provided for this image

2. Preprocess the data (prepare it for a model)¶

The most important step is creating a training and test set for the model. For us, the data has already been split into training and test sets. Another option would be to create a validation set as well, but we won't implement one for now.

It's standard to separate train and test data into their own directories, with a subdirectory for each class.

To start, we define the train/test directory paths.

In [32]:
# define training and test directory paths
train_dir = 'pizza_steak/train/'
test_dir = 'pizza_steak/test/'
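Before batching, we can sanity-check the directory structure by counting the files in each folder (a helper sketch, not part of the original notebook; assumes the pizza_steak folders exist):

```python
import os

def count_files(root):
    """Return {directory: number_of_files} for root and every subdirectory."""
    return {dirpath: len(filenames) for dirpath, _, filenames in os.walk(root)}
```

Running count_files('pizza_steak/train/') should report per-class counts that sum to the 1500 training images we see below.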

Next, we turn the data into batches.

A batch is a small subset of the dataset that the model looks at during training. Instead of looking at all 10,000 (or however many) images at once to figure out the patterns between them, the model might look at 32 at a time.

It does this for a few reasons:

  • 10,000 images or more may not fit into the memory of the GPU
  • Trying to learn the patterns in 10,000 images in one go can result in the model not learning very well

Why 32? It simply tends to work well in practice, as demonstrated by Wilson and Martinez's benchmark tests.

alt text

The results show that smaller batch sizes often lead to better performance, so it's rarely worth going above 32.

To turn our data into batches, we'll create an instance of ImageDataGenerator for each of our datasets.

In [33]:
# create train and test data generators and rescale the data
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1/255.)
test_datagen = ImageDataGenerator(rescale=1/255.)

The ImageDataGenerator class helps prepare our images into batches, as well as perform transformations on them as they get loaded into the model.

You might've noticed the rescale parameter; this is an example of a transformation we're applying. 255 is the maximum value a colour channel can take. Neural networks tend to work best with values between 0 and 1, so we divide every pixel value by its maximum of 255; that way, even a pixel at the maximum becomes 1, fit for neural networks.
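In plain numbers, the rescale transform is just a division (a tiny illustration, not TensorFlow code):

```python
# pixel values come in as 0-255; rescale=1/255. maps them into [0, 1]
pixels = [0, 64, 128, 255]
scaled = [p * (1 / 255.0) for p in pixels]
print(scaled[0], scaled[-1])  # 0.0 1.0
```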

We can load our images from their respective directories using the flow_from_directory method.

In [34]:
# turn it into batches
train_data = train_datagen.flow_from_directory(directory=train_dir,
                                               target_size=(224,224),
                                               class_mode='binary',
                                               batch_size=32)

test_data = test_datagen.flow_from_directory(directory=test_dir,
                                             target_size=(224,224),
                                             class_mode='binary',
                                             batch_size=32)
Found 1500 images belonging to 2 classes.
Found 500 images belonging to 2 classes.

Now we have 1500 training images belonging to 2 classes (pizza and steak), and 500 test images belonging to the same 2 classes.

Some things to note:

  • Because of the directory structure, the classes are inferred from the subdirectory names in the train/test folders.
  • The target_size parameter defines the input size in (height, width) format.
  • The class_mode value of 'binary' defines our classification problem type. With more than two categories, it becomes 'categorical'.
  • The batch_size defines how many images go in each batch. We're using the default of 32.

We can take a look at our batched images and labels by inspecting the train_data object.

In [41]:
# get a sample of the training data
images, labels = next(train_data) # getting the next batch of images
len(images), len(labels)
Out[41]:
(32, 32)

Seems our data comes in batches of 32, as it should.

Let's see what the images look like.

In [42]:
# get the first two images
images[:2], images[0].shape
Out[42]:
(array([[[[0.56078434, 0.63529414, 0.79215693],
          [0.5647059 , 0.6392157 , 0.7960785 ],
          [0.5647059 , 0.6392157 , 0.80392164],
          ...,
          [0.07843138, 0.08235294, 0.05882353],
          [0.08235294, 0.08235294, 0.07450981],
          [0.09803922, 0.09803922, 0.09803922]],
 
         [[0.5647059 , 0.6392157 , 0.7960785 ],
          [0.5568628 , 0.6313726 , 0.7960785 ],
          [0.5568628 , 0.6313726 , 0.7960785 ],
          ...,
          [0.09803922, 0.10196079, 0.07058824],
          [0.0627451 , 0.06666667, 0.04705883],
          [0.04313726, 0.04313726, 0.03529412]],
 
         [[0.5686275 , 0.6431373 , 0.8078432 ],
          [0.5647059 , 0.6392157 , 0.80392164],
          [0.5647059 , 0.6392157 , 0.8078432 ],
          ...,
          [0.07450981, 0.07843138, 0.04705883],
          [0.15686275, 0.16078432, 0.13725491],
          [0.21568629, 0.21960786, 0.20000002]],
 
         ...,
 
         [[0.3921569 , 0.34901962, 0.22352943],
          [0.39607847, 0.3529412 , 0.23529413],
          [0.3372549 , 0.28235295, 0.1764706 ],
          ...,
          [0.5372549 , 0.5294118 , 0.5803922 ],
          [0.5372549 , 0.5294118 , 0.5803922 ],
          [0.53333336, 0.5254902 , 0.5764706 ]],
 
         [[0.38431376, 0.34901962, 0.23529413],
          [0.34117648, 0.30588236, 0.19215688],
          [0.16862746, 0.12941177, 0.03137255],
          ...,
          [0.5372549 , 0.5294118 , 0.58431375],
          [0.5372549 , 0.5294118 , 0.58431375],
          [0.5411765 , 0.5254902 , 0.5803922 ]],
 
         [[0.17254902, 0.14901961, 0.05490196],
          [0.22352943, 0.20000002, 0.10588236],
          [0.21176472, 0.18039216, 0.09019608],
          ...,
          [0.5254902 , 0.5137255 , 0.5803922 ],
          [0.5294118 , 0.5137255 , 0.57254905],
          [0.5294118 , 0.5137255 , 0.5686275 ]]],
 
 
        [[[0.31764707, 0.39607847, 0.5019608 ],
          [0.38431376, 0.46274513, 0.56078434],
          [0.34117648, 0.427451  , 0.5176471 ],
          ...,
          [0.31764707, 0.24705884, 0.24705884],
          [0.28627452, 0.21176472, 0.21960786],
          [0.27058825, 0.19607845, 0.20392159]],
 
         [[0.27450982, 0.29803923, 0.43137258],
          [0.3137255 , 0.3372549 , 0.46274513],
          [0.3019608 , 0.3372549 , 0.45098042],
          ...,
          [0.32156864, 0.25882354, 0.25882354],
          [0.3019608 , 0.2392157 , 0.2392157 ],
          [0.28627452, 0.22352943, 0.22352943]],
 
         [[0.28627452, 0.26666668, 0.42352945],
          [0.32156864, 0.3137255 , 0.4666667 ],
          [0.32941177, 0.32941177, 0.47058827],
          ...,
          [0.30588236, 0.2509804 , 0.24705884],
          [0.29803923, 0.24313727, 0.2392157 ],
          [0.3137255 , 0.25882354, 0.24705884]],
 
         ...,
 
         [[0.18039216, 0.08627451, 0.14901961],
          [0.18039216, 0.08235294, 0.15686275],
          [0.1764706 , 0.07843138, 0.16078432],
          ...,
          [0.4784314 , 0.47450984, 0.4039216 ],
          [0.44705886, 0.4431373 , 0.37254903],
          [0.43529415, 0.43137258, 0.36078432]],
 
         [[0.18431373, 0.07843138, 0.13725491],
          [0.18039216, 0.08627451, 0.14901961],
          [0.1764706 , 0.08235294, 0.14509805],
          ...,
          [0.36862746, 0.3647059 , 0.29411766],
          [0.34117648, 0.34509805, 0.27450982],
          [0.34901962, 0.3529412 , 0.28235295]],
 
         [[0.18431373, 0.08235294, 0.13333334],
          [0.19215688, 0.09019608, 0.14117648],
          [0.18431373, 0.09019608, 0.14509805],
          ...,
          [0.5411765 , 0.53333336, 0.47450984],
          [0.5294118 , 0.53333336, 0.47058827],
          [0.5568628 , 0.56078434, 0.49803925]]]], dtype=float32),
 (224, 224, 3))

Images are (224, 224, 3) thanks to our target_size resizing, and their values vary from 0 to 1 thanks to the rescaling.

How about the labels?

In [43]:
# view labels
labels
Out[43]:
array([0., 1., 1., 0., 1., 0., 1., 1., 0., 1., 0., 0., 1., 1., 1., 0., 0.,
       0., 0., 1., 1., 1., 0., 0., 0., 0., 1., 1., 0., 1., 0., 1.],
      dtype=float32)

Our class mode is binary, so the labels are either 0 (pizza) or 1 (steak).

Our data is now ready, and we can have the model figure out the patterns between the image tensors and their labels.

3. Create a model (start with a baseline)¶

What should our default architecture be? There are many possible answers.

A simple, trial-and-error way to get started on a vision model is to use the architecture that currently performs best on ImageNet (a large collection of diverse images used to benchmark computer vision models).

But before that, it's good to build a smaller model to get a baseline result you can improve upon.

Note: In deep learning terms, a small model is one with fewer layers than state-of-the-art (SOTA) models. Think 3-4 layers, versus the 50+ layers of ResNet50.

For our small model, we can take the one from the CNN Explainer website (model_1 from above) as a guide and build a 3-layer convolutional neural network.

In [44]:
# things to import for our model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D, Activation
from tensorflow.keras import Sequential
In [45]:
# Create the model (can be our baseline with 3 convolutional layers)
model_4 = Sequential([
    Conv2D(filters=10, kernel_size=3, strides=1, padding='valid', activation='relu', input_shape=(224,224,3)),
    # filters = the number of features the model will learn
    # padding = determines whether to keep or discard original spatial/image size. valid = shrink down to where kernel_size can apply its filters
    Conv2D(10,3,activation='relu'),
    Conv2D(10,3,activation='relu'),
    Flatten(),
    Dense(1, activation='sigmoid')
])
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

This follows a typical CNN structure of:

Input > Conv + ReLU layers (non-linearities) > Pooling layer > Fully connected (dense layer) as Output

Let's discuss some of the components of the Conv2D layer:

  • The "2D" means our inputs are two dimensional (height and width), even though they have 3 colour channels, the convolutions are run on each channel invididually.
  • filters - these are the number of "feature extractors" that will be moving over our images.
  • kernel_size - the size of our filters, for example, a kernel_size of (3, 3) (or just 3) will mean each filter will have the size 3x3, meaning it will look at a space of 3x3 pixels each time. The smaller the kernel, the more fine-grained features it will extract.
  • stride - the number of pixels a filter will move across as it covers the image. A stride of 1 means the filter moves across each pixel 1 by 1. A stride of 2 means it moves 2 pixels at a time.
  • padding - this can be either 'same' or 'valid', 'same' adds zeros the to outside of the image so the resulting output of the convolutional layer is the same as the input, where as 'valid' (default) cuts off excess pixels where the filter doesn't fit (e.g. 224 pixels wide divided by a kernel size of 3 (224/3 = 74.6) means 2 pixels will get cut off the end, as it leaves a remainder of 2.
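The effect of padding on output size follows the standard convolution arithmetic, sketched here as a small helper (not part of the original notebook):

```python
import math

def conv_output_size(n, kernel, stride=1, padding="valid"):
    """Spatial output size of a conv layer along one dimension."""
    if padding == "same":
        return math.ceil(n / stride)      # zero-padding preserves size at stride 1
    return (n - kernel) // stride + 1     # 'valid': only where the filter fully fits

print(conv_output_size(224, 3))                  # 222
print(conv_output_size(224, 3, padding="same"))  # 224
```

The 224 > 222 > 220 > 218 shrink this predicts is exactly what model_4's summary reports below.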

Now let's compile the model

In [47]:
# compile the model
model_4.compile(loss='binary_crossentropy',
                optimizer=Adam(),
                metrics=['accuracy'])

4. Fit a model¶

It's time to fit our model. But you may notice two extra parameters:

  • steps_per_epoch > the number of batches the model will go through per epoch. If the batch size is 32 and steps_per_epoch is 10, the model goes through 10 batches, i.e. 320 images. We want it to go through all the images in train_data (1500 images / 32 per batch ~ 47 steps).

  • validation_steps > same as above but for the validation data (our test folder): 500 images / 32 per batch ~ 16 steps.
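The step counts are just ceiling division of the image count by the batch size:

```python
import math

batch_size = 32
steps_per_epoch = math.ceil(1500 / batch_size)   # training images
validation_steps = math.ceil(500 / batch_size)   # test images
print(steps_per_epoch, validation_steps)  # 47 16
```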

In [48]:
# check length of training and test data generators
len(train_data), len(test_data)
Out[48]:
(47, 16)
In [49]:
# fit the model
history_4 = model_4.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=test_data,
                        validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 28s 541ms/step - accuracy: 0.5709 - loss: 1.4794 - val_accuracy: 0.7260 - val_loss: 0.5936
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 9s 197ms/step - accuracy: 0.7388 - loss: 0.5303 - val_accuracy: 0.8020 - val_loss: 0.4486
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 10s 205ms/step - accuracy: 0.8686 - loss: 0.3660 - val_accuracy: 0.7660 - val_loss: 0.4861
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 10s 214ms/step - accuracy: 0.9386 - loss: 0.1896 - val_accuracy: 0.8340 - val_loss: 0.4227
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 10s 215ms/step - accuracy: 0.9898 - loss: 0.0587 - val_accuracy: 0.8460 - val_loss: 0.4319

5. Evaluate the model¶

Oh yeah! Looks like our model is learning something!

Let's check the training curves

In [52]:
!pip install pandas
Collecting pandas
  Downloading pandas-2.2.3-cp310-cp310-win_amd64.whl.metadata (19 kB)
Requirement already satisfied: numpy>=1.22.4 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from pandas) (2.1.3)
Requirement already satisfied: python-dateutil>=2.8.2 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from pandas) (2.9.0.post0)
Collecting pytz>=2020.1 (from pandas)
  Downloading pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas)
  Downloading tzdata-2025.2-py2.py3-none-any.whl.metadata (1.4 kB)
Requirement already satisfied: six>=1.5 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)
Downloading pandas-2.2.3-cp310-cp310-win_amd64.whl (11.6 MB)
Downloading pytz-2025.2-py2.py3-none-any.whl (509 kB)
Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB)
Installing collected packages: pytz, tzdata, pandas


Successfully installed pandas-2.2.3 pytz-2025.2 tzdata-2025.2
In [54]:
# plot the training curves
import pandas as pd
pd.DataFrame(history_4.history).plot(figsize=(10,7));
No description has been provided for this image

Training accuracy almost seems to reach 100% after 5 epochs, but val_accuracy struggles to keep up. There's a possibility the model is overfitting.

Let's separate the loss curves from accuracy to get a clearer picture

In [64]:
# plot the loss and accuracy data separately
def plot_loss_curves(history):
    """
    Returns separate loss curves for training and validation metrics.
    """
    loss = history.history['loss']
    val_loss = history.history['val_loss']

    accuracy = history.history['accuracy']
    val_accuracy = history.history['val_accuracy']

    epochs = range(len(history.history['loss']))

    # plot loss
    plt.figure()
    plt.plot(epochs, loss, label='Training Loss')
    plt.plot(epochs, val_loss, label='Validation Loss')
    plt.title('Loss')
    plt.xlabel('Epochs')
    plt.legend();

    # plot accuracy
    plt.figure()
    plt.plot(epochs, accuracy, label='Training Accuracy')
    plt.plot(epochs, val_accuracy, label='Validation Accuracy')
    plt.title('Accuracy')
    plt.xlabel('Epochs')
    plt.legend();
In [65]:
# check out loss and accuracy curves in model_4
plot_loss_curves(history_4)
No description has been provided for this image
No description has been provided for this image

Ideally, the validation curves trail just behind the training curves in both accuracy and loss. If the gap between them grows over time, the model is likely overfitting.
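A quick numeric way to eyeball that gap from a training history (a sketch using approximate per-epoch values from the run above, stored as a plain dict rather than a Keras History object):

```python
# approximate accuracy values from history_4's five epochs above
history = {"accuracy": [0.57, 0.74, 0.87, 0.94, 0.99],
           "val_accuracy": [0.73, 0.80, 0.77, 0.83, 0.85]}

# final-epoch gap between training and validation accuracy
gap = history["accuracy"][-1] - history["val_accuracy"][-1]
print(f"train/val accuracy gap: {gap:.2f}")  # ~0.14, a sign of overfitting
```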

In [66]:
# check model's architecture
model_4.summary()
Model: "sequential_4"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_4 (Conv2D)               │ (None, 222, 222, 10)   │           280 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_5 (Conv2D)               │ (None, 220, 220, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_6 (Conv2D)               │ (None, 218, 218, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_4 (Flatten)             │ (None, 475240)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_10 (Dense)                │ (None, 1)              │       475,241 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 1,432,025 (5.46 MB)
 Trainable params: 477,341 (1.82 MB)
 Non-trainable params: 0 (0.00 B)
 Optimizer params: 954,684 (3.64 MB)

6. Adjust the model parameters¶

There are 3 steps to fitting an ML model:

  1. Create a baseline
  2. Beat the baseline by overfitting a larger model
  3. Reduce overfitting

We've done steps 1 and 2, but there are other ways to push the model's capacity (and overfit) further:

  • Increase the number of convolutional layers
  • Increase the number of convolutional filters
  • Add another dense layer to the output of our flattened layer

Our focus now is to bring the training and validation curves closer together, i.e. reduce overfitting.

But why care about overfitting if training accuracy is so good? If a model performs well on training data but poorly on validation data, it has a poor ability to predict on unseen data. Instead of learning generalizable, real-world patterns, it memorizes quirks of the training data that aren't really patterns at all.

So for the next few models we build, we'll adjust the number of parameters and inspect the training curves as well.

We'll build 2 models:

  • A CNN with max pooling
  • A CNN with max pooling and data augmentation

For the first model, we'll follow this CNN structure:

Input > Conv layers + ReLU layers > Max pooling layers > Fully connected (Dense) layers as output

This model will have the same structure as model_4, but with max pooling layers included.

In [67]:
# create the model (a 3 layer convolutional neural network)
model_5 = Sequential([
    Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
    MaxPool2D(pool_size=2), # halves the feature map's height and width
    Conv2D(10,3,activation='relu'),
    MaxPool2D(), # doing this for every CNN layer
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Flatten(),
    Dense(1,activation='sigmoid')
])
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
In [68]:
# compile the model
model_5.compile(loss='binary_crossentropy',
                optimizer=Adam(),
                metrics=['accuracy'])
In [70]:
# fit the model
history_5 = model_5.fit(train_data,
                    epochs=5,
                    steps_per_epoch=len(train_data),
                    validation_data=test_data,
                    validation_steps=len(test_data))
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 25s 491ms/step - accuracy: 0.6468 - loss: 0.6387 - val_accuracy: 0.8000 - val_loss: 0.4548
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 6s 130ms/step - accuracy: 0.7494 - loss: 0.5103 - val_accuracy: 0.8000 - val_loss: 0.4095
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 6s 132ms/step - accuracy: 0.8167 - loss: 0.4251 - val_accuracy: 0.8260 - val_loss: 0.3699
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 6s 132ms/step - accuracy: 0.8250 - loss: 0.3963 - val_accuracy: 0.8360 - val_loss: 0.3585
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 6s 132ms/step - accuracy: 0.8144 - loss: 0.3979 - val_accuracy: 0.8740 - val_loss: 0.3554

It seems model_5 performs worse on the training data, but better on the validation data.

Before checking the training curves, let's see the architecture.

In [71]:
# check model architecture
model_5.summary()
Model: "sequential_5"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_7 (Conv2D)               │ (None, 222, 222, 10)   │           280 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_2 (MaxPooling2D)  │ (None, 111, 111, 10)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_8 (Conv2D)               │ (None, 109, 109, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_3 (MaxPooling2D)  │ (None, 54, 54, 10)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_9 (Conv2D)               │ (None, 52, 52, 10)     │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_4 (MaxPooling2D)  │ (None, 26, 26, 10)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_5 (Flatten)             │ (None, 6760)           │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_11 (Dense)                │ (None, 1)              │         6,761 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 26,585 (103.85 KB)
 Trainable params: 8,861 (34.61 KB)
 Non-trainable params: 0 (0.00 B)
 Optimizer params: 17,724 (69.24 KB)

Notice how MaxPooling2D halves the output shape every time it's applied, keeping only the most important value within each 2x2 window before shrinking down.

The bigger the pool_size, the more aggressively max pooling condenses the image's features. But if it's too big, too few features survive and the model learns nothing.

It's a major reduction in trainable parameters: down to 8,861 from 477,341 in model_4.
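Conceptually, what 2x2 max pooling does can be sketched in NumPy (an illustrative toy, not Keras's actual implementation):

```python
import numpy as np

# Toy sketch of 2x2 max pooling with stride 2: keep the largest value in
# each non-overlapping 2x2 window, halving the height and width.
def max_pool_2x2(feature_map):
    h, w = feature_map.shape
    # trim odd edges, then group into 2x2 blocks and take the max of each
    trimmed = feature_map[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [6, 1, 0, 2],
               [3, 8, 7, 1]])

pooled = max_pool_2x2(fm)
print(pooled)        # [[4 5]
                     #  [8 7]]
print(pooled.shape)  # (2, 2) -- a 4x4 map shrinks to 2x2
```

This mirrors why each MaxPooling2D layer in the summary halves the spatial dimensions while adding zero parameters.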

Time to check the loss curves.

In [72]:
# plot loss curves of model_5 results
plot_loss_curves(history_5)
[Plot: training vs validation loss]
[Plot: training vs validation accuracy]

We can see the curves are closer to each other, but the validation loss looks like it's flattening out, risking overfitting again.

Time to try other overfitting-prevention techniques, including data augmentation.

We'll first see how it's done in code, then explain its inner workings. For data augmentation, we'll need to create a new ImageDataGenerator instance.

In [75]:
# create ImageDataGenerator training instance with data augmentation
train_datagen_augmentation = ImageDataGenerator(rescale=1/255.,
                                                rotation_range=20, # rotate the image between 0 and 20 degrees,
                                                shear_range=0.2, # shear the image,
                                                zoom_range=0.2, # zoom into the image,
                                                width_shift_range=0.2, # shift the image width ways,
                                                height_shift_range=0.2, # shift the image height ways,
                                                horizontal_flip=True) # flipping image in horizontal axis

# create imagedatagenerator training instance without data augmentation
train_datagen = ImageDataGenerator(rescale=1/255.)

# create imagedatagenerator test instance without data augmentation
test_datagen = ImageDataGenerator(rescale=1/255.)

Now, what's data augmentation?

It's the process of altering our training data by a slight margin. This gives more diversity in training data, and allows the model to learn more generalizable patterns. Whether that be rotating the image slightly, flipping image horizontally, or slight cropping of the edges.

This helps simulate the data a model may encounter in the real world.

Note: Data augmentation is only ever performed on training data. ImageDataGenerator applies random augmentations to batches of training images on the fly; the files on disk are left unchanged. If we augmented the validation/test data, evaluations wouldn't be comparable, since the images would change randomly every time we validated.
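Two of these transformations can be sketched in plain NumPy to show the idea (a toy illustration only; ImageDataGenerator's real transforms use interpolation and randomly sampled parameters):

```python
import numpy as np

# Toy sketches of two augmentations: a horizontal flip and a width shift.
def horizontal_flip(img):
    return img[:, ::-1]                  # mirror along the width axis

def width_shift(img, pixels):
    return np.roll(img, pixels, axis=1)  # crude shift: wraps around the edge

img = np.arange(9).reshape(3, 3)  # stand-in for a (height, width) image
print(horizontal_flip(img))       # [[2 1 0] [5 4 3] [8 7 6]]
print(width_shift(img, 1))        # [[2 0 1] [5 3 4] [8 6 7]]
```

Each training batch gets a fresh random combination of such transforms, so the model rarely sees the exact same pixels twice.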

In [82]:
# import data and augment it from training directory
print('Augmented training images:')
train_data_augmented = train_datagen_augmentation.flow_from_directory(train_dir,
                                                        target_size=(224,224),
                                                        batch_size=32,
                                                        class_mode='binary',
                                                        shuffle=False) # keeping this for demonstration purposes. But it's good to shuffle

# create non-augmented data batches
print('Non-augmented training images:')
train_data = train_datagen.flow_from_directory(train_dir,
                                               target_size=(224,224),
                                               batch_size=32,
                                               class_mode='binary',
                                               shuffle=False)

print('Unchanged test images:')
test_data = test_datagen.flow_from_directory(test_dir,
                                             target_size=(224,224),
                                             batch_size=32,
                                             class_mode='binary',
                                             shuffle=False) # don't shuffle test data
Augmented training images:
Found 1500 images belonging to 2 classes.
Non-augmented training images:
Found 1500 images belonging to 2 classes.
Unchanged test images:
Found 500 images belonging to 2 classes.

Let's visualize the augmented, and non augmented data

In [84]:
!pip install scipy
Collecting scipy
  Downloading scipy-1.15.3-cp310-cp310-win_amd64.whl.metadata (60 kB)
Requirement already satisfied: numpy<2.5,>=1.23.5 in x:\anaconda3\envs\tf_gpu\lib\site-packages (from scipy) (2.1.3)
Downloading scipy-1.15.3-cp310-cp310-win_amd64.whl (41.3 MB)
Installing collected packages: scipy
Successfully installed scipy-1.15.3
In [85]:
# get data batch samples
images, labels = next(train_data)
augmented_images, augmented_labels = next(train_data_augmented) # labels aren't augmented by the way
In [86]:
# show original image and augmented image
random_number = random.randint(0,31) # pick a random index from the batch of 32 images
plt.imshow(images[random_number])
plt.title(f'Original image')
plt.axis(False)
plt.figure()
plt.imshow(augmented_images[random_number])
plt.title(f'Augmented image')
plt.axis(False);
[Image: original training image]
[Image: augmented training image]

You can see the slight difference between the original and augmented images. The slight warping, cropping and rotating forces the model to learn patterns from less-than-ideal images, which is often the case with real-world photos.

Data augmentation is a great solution if you find your model is overfitting too much. As for how strong the augmentation should be, there's no set rule; it's best to check the options in the ImageDataGenerator class and think about your particular use case and data.

Now let's try refitting the same architecture as model_5 on the augmented images.

In [90]:
# create the model (same as model_5)
model_6 = Sequential([
    Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
    MaxPool2D(pool_size=2),
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Flatten(),
    Dense(1,activation='sigmoid')
])

# compile model
model_6.compile(loss='binary_crossentropy',
                optimizer='Adam',
                metrics=['accuracy'])

# fit the model
history_6 = model_6.fit(train_data_augmented,
                        epochs=5,
                        steps_per_epoch=len(train_data_augmented),
                        validation_data=test_data,
                        validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 404ms/step - accuracy: 0.5125 - loss: 0.8797 - val_accuracy: 0.4920 - val_loss: 0.6938
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 390ms/step - accuracy: 0.4448 - loss: 0.6950 - val_accuracy: 0.5240 - val_loss: 0.6920
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 382ms/step - accuracy: 0.5127 - loss: 0.6914 - val_accuracy: 0.5080 - val_loss: 0.6865
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 381ms/step - accuracy: 0.5620 - loss: 0.6829 - val_accuracy: 0.6200 - val_loss: 0.6837
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 390ms/step - accuracy: 0.5723 - loss: 0.6834 - val_accuracy: 0.5460 - val_loss: 0.6752

It appears the model didn't get as good of results this time. Why?

It's because we turned off shuffling with shuffle=False, which means the model sees the same batches of images in the same order every time.

Our data is organized by class folders, so without shuffling, the generator draws from the pizza folder first. Those batches contain only pizzas, no steak to compare against. Shuffling solves this by mixing pizza and steak in every batch.

Now that we know what's wrong, we can flip to shuffle=True, since we're done with demonstration purposes.
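The problem can be sketched with a toy example (the labels list below is hypothetical, but the class counts match our dataset):

```python
import random

# Toy illustration of why shuffle=False hurts: 1500 class-sorted images
# read in directory order means early batches contain a single class.
labels = ['pizza'] * 750 + ['steak'] * 750  # all pizza first, then all steak

batch_size = 32
first_batch = labels[:batch_size]
print(set(first_batch))  # {'pizza'} -- no steak in the batch to compare against

random.seed(42)
random.shuffle(labels)
print(set(labels[:batch_size]))  # with shuffling, both classes appear
```

With only one class per batch, the gradient updates keep pushing the model toward a single answer, which is why model_6's curves bounce around.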

You can see how data augmentation also increases training time. There are options to speed this up with TensorFlow's parallel reads and buffered prefetching.

In [91]:
# check model performance
plot_loss_curves(history_6)
[Plot: training vs validation loss]
[Plot: training vs validation accuracy]

Our accuracy is jumping around quite a lot. It's heading in the right direction, but ideally we want a smooth, steadily increasing curve that gets close to 1.

Now let's try the shuffled augmented data.

In [92]:
# import data and augment it from directories
train_data_augmented_shuffled = train_datagen_augmentation.flow_from_directory(train_dir,
                                                                               target_size=(224,224),
                                                                               batch_size=32,
                                                                               class_mode='binary',
                                                                               shuffle=True)
Found 1500 images belonging to 2 classes.
In [95]:
# create model, same as 5 and 6
model_7 = Sequential([
    Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
    MaxPool2D(pool_size=2),
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Flatten(),
    Dense(1,activation='sigmoid')
])

# compile model
model_7.compile(loss='binary_crossentropy',
                optimizer='Adam',
                metrics=['accuracy'])

# fit the model
history_7 = model_7.fit(train_data_augmented_shuffled,
                        epochs=5,
                        steps_per_epoch=len(train_data_augmented_shuffled),
                        validation_data=test_data,
                        validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 18s 367ms/step - accuracy: 0.5350 - loss: 0.7032 - val_accuracy: 0.7380 - val_loss: 0.5723
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 17s 368ms/step - accuracy: 0.7090 - loss: 0.5906 - val_accuracy: 0.8240 - val_loss: 0.4201
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 17s 363ms/step - accuracy: 0.7435 - loss: 0.5351 - val_accuracy: 0.8160 - val_loss: 0.3851
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 17s 362ms/step - accuracy: 0.7734 - loss: 0.4889 - val_accuracy: 0.8280 - val_loss: 0.3554
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 17s 358ms/step - accuracy: 0.7625 - loss: 0.5001 - val_accuracy: 0.8540 - val_loss: 0.3528
In [96]:
# check model's performance history training on augmented data
plot_loss_curves(history_7)
[Plot: training vs validation loss]
[Plot: training vs validation accuracy]

model_7 improved consistently compared to model_6, thanks to the shuffling.

Our loss curves also appear smoother.

7. Repeat until satisfied¶

We've beaten the baseline quite well, and there are more ways to keep improving the model:

  • Increase the number of model layers (e.g. more CNN layers)
  • Increase the number of filters in each conv layer (e.g. from 10 to 32, 64 or 128; these are common values to trial)
  • Train for longer (more epochs)
  • Find an ideal learning rate
  • Get more data
  • Use transfer learning to leverage what other image models have learned and adjust it for our specific case

Adjusting these settings (except the last two) during development is known as hyperparameter tuning.
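One simple way to organize hyperparameter tuning is to enumerate the combinations you want to try as a grid (a hedged sketch; the values below are common starting points, not prescriptions):

```python
import itertools

# Sketch of a hyperparameter grid: one candidate model per combination.
grid = {
    'filters': [10, 32, 64],
    'learning_rate': [1e-2, 1e-3, 1e-4],
    'epochs': [5, 10],
}

# Expand the grid into a list of config dicts (3 * 3 * 2 = 18 combinations)
combos = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
print(len(combos))  # 18
print(combos[0])    # {'filters': 10, 'learning_rate': 0.01, 'epochs': 5}
```

In practice you'd loop over `combos`, build and fit a model per config, and keep the one with the best validation curves; grids grow multiplicatively, so keep them small.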

Let's go back to model_1's TinyVGG architecture.

In [97]:
# Create a CNN model (same as Tiny VGG but for binary classification)
model_8 = Sequential([
    Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Conv2D(10,3,activation='relu'),
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Flatten(),
    Dense(1, activation='sigmoid')
])

# compile the model
model_8.compile(loss='binary_crossentropy',
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy'])

# fit the model
history_8 = model_8.fit(train_data_augmented_shuffled,
                        epochs=5,
                        steps_per_epoch=len(train_data_augmented_shuffled),
                        validation_data=test_data,
                        validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Epoch 1/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 22s 422ms/step - accuracy: 0.6118 - loss: 0.6556 - val_accuracy: 0.8280 - val_loss: 0.4363
Epoch 2/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 423ms/step - accuracy: 0.7695 - loss: 0.4911 - val_accuracy: 0.8680 - val_loss: 0.3613
Epoch 3/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 419ms/step - accuracy: 0.7753 - loss: 0.4775 - val_accuracy: 0.8860 - val_loss: 0.3220
Epoch 4/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 416ms/step - accuracy: 0.7944 - loss: 0.4477 - val_accuracy: 0.8620 - val_loss: 0.3247
Epoch 5/5
47/47 ━━━━━━━━━━━━━━━━━━━━ 20s 423ms/step - accuracy: 0.8036 - loss: 0.4459 - val_accuracy: 0.8880 - val_loss: 0.2907

Note: You may notice some differences between the code for model_8 and model_1, mostly that model_8 imports Conv2D and the other layers directly from tensorflow.keras.layers. It reduces the amount of code, but calls exactly the same layers.

In [98]:
# check architecture of model_1
model_1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 222, 222, 10)   │           280 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 220, 220, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 110, 110, 10)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_2 (Conv2D)               │ (None, 108, 108, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_3 (Conv2D)               │ (None, 106, 106, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D)  │ (None, 53, 53, 10)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 28090)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 1)              │        28,091 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 93,305 (364.48 KB)
 Trainable params: 31,101 (121.49 KB)
 Non-trainable params: 0 (0.00 B)
 Optimizer params: 62,204 (242.99 KB)
In [99]:
# check architecture of model_8
model_8.summary()
Model: "sequential_11"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_25 (Conv2D)              │ (None, 222, 222, 10)   │           280 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_26 (Conv2D)              │ (None, 220, 220, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_20 (MaxPooling2D) │ (None, 110, 110, 10)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_27 (Conv2D)              │ (None, 108, 108, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_28 (Conv2D)              │ (None, 106, 106, 10)   │           910 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_21 (MaxPooling2D) │ (None, 53, 53, 10)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_11 (Flatten)            │ (None, 28090)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_17 (Dense)                │ (None, 1)              │        28,091 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 93,305 (364.48 KB)
 Trainable params: 31,101 (121.49 KB)
 Non-trainable params: 0 (0.00 B)
 Optimizer params: 62,204 (242.99 KB)

Now let's check our TinyVGG model's performance

In [100]:
# check out the TinyVGG model performance
plot_loss_curves(history_8)
[Plot: training vs validation loss]
[Plot: training vs validation accuracy]
In [103]:
# visually compare with model_1's history
plot_loss_curves(history_1)
[Plot: training vs validation loss]
[Plot: training vs validation accuracy]

The training curves look good, though the improvement isn't that impressive compared to the previous model. Maybe it's time to give the model more training time, aka more epochs.

Making a prediction with our trained model¶

What good is a trained model if you can't make predictions with it? Let's upload a couple of our own images and see how the model handles them.

In [104]:
# classes we're working with
print(class_names)
['pizza' 'steak']

We'll use this steak image to do our first test

In [106]:
# view our example image
steak=mpimg.imread('03-steak.jpg')
plt.imshow(steak)
plt.axis(False);
[Image: example steak photo]
In [107]:
# check the shape of our image
steak.shape
Out[107]:
(4032, 3024, 3)

Our model only accepts inputs of shape (224, 224, 3), so we need to resize our custom image to fit the model's inputs.

We'll need to read the file with tf.io.read_file and decode/resize it into a tensor with tf.image.

In [108]:
# create a function to import an image and resize it, so it can be used in model
def load_and_prep_image(filename, img_shape=224):
    """
    Reads an image from filename, turns it into a tensor and reshapes it to (img_shape, img_shape, colour channel)
    """
    # read in target file (an image)
    img = tf.io.read_file(filename)
    
    # decode read file into tensor, and confirm it still has 3 colour channels
    # (our model is trained with 3 colour channels, but some images may have 4)
    img = tf.image.decode_image(img, channels=3)

    # resize image to the size the model has been trained on
    img = tf.image.resize(img, size=[img_shape,img_shape])

    # rescale image so all values are within 0 and 1
    img = img/255.
    return img

We now have a function to load custom images for our model. Time to load in the image.

In [109]:
# load in and preprocess custom image
steak = load_and_prep_image('03-steak.jpg')
steak
Out[109]:
<tf.Tensor: shape=(224, 224, 3), dtype=float32, numpy=
array([[[0.6377451 , 0.6220588 , 0.57892156],
        [0.6504902 , 0.63186276, 0.5897059 ],
        [0.63186276, 0.60833335, 0.5612745 ],
        ...,
        [0.52156866, 0.05098039, 0.09019608],
        [0.49509802, 0.04215686, 0.07058824],
        [0.52843136, 0.07745098, 0.10490196]],

       [[0.6617647 , 0.6460784 , 0.6107843 ],
        [0.6387255 , 0.6230392 , 0.57598037],
        [0.65588236, 0.63235295, 0.5852941 ],
        ...,
        [0.5352941 , 0.06862745, 0.09215686],
        [0.529902  , 0.05931373, 0.09460784],
        [0.5142157 , 0.05539216, 0.08676471]],

       [[0.6519608 , 0.6362745 , 0.5892157 ],
        [0.6392157 , 0.6137255 , 0.56764704],
        [0.65637255, 0.6269608 , 0.5828431 ],
        ...,
        [0.53137255, 0.06470589, 0.08039216],
        [0.527451  , 0.06862745, 0.1       ],
        [0.52254903, 0.05196078, 0.0872549 ]],

       ...,

       [[0.49313724, 0.42745098, 0.31029412],
        [0.05441177, 0.01911765, 0.        ],
        [0.2127451 , 0.16176471, 0.09509804],
        ...,
        [0.6132353 , 0.59362745, 0.57009804],
        [0.65294117, 0.6333333 , 0.6098039 ],
        [0.64166665, 0.62990195, 0.59460783]],

       [[0.65392154, 0.5715686 , 0.45      ],
        [0.6367647 , 0.54656863, 0.425     ],
        [0.04656863, 0.01372549, 0.        ],
        ...,
        [0.6372549 , 0.61764705, 0.59411764],
        [0.63529414, 0.6215686 , 0.5892157 ],
        [0.6401961 , 0.62058824, 0.59705883]],

       [[0.1       , 0.05539216, 0.        ],
        [0.48333332, 0.40882352, 0.29117647],
        [0.65      , 0.5686275 , 0.44019607],
        ...,
        [0.6308824 , 0.6161765 , 0.5808824 ],
        [0.6519608 , 0.63186276, 0.5901961 ],
        [0.6338235 , 0.6259804 , 0.57892156]]], dtype=float32)>

Nice, let's test it with the model!

In [111]:
model_8.predict(steak)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[111], line 1
----> 1 model_8.predict(steak)

File x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\utils\traceback_utils.py:122, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    119     filtered_tb = _process_traceback_frames(e.__traceback__)
    120     # To get the full stack trace, call:
    121     # `keras.config.disable_traceback_filtering()`
--> 122     raise e.with_traceback(filtered_tb) from None
    123 finally:
    124     del filtered_tb

File x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\models\functional.py:276, in Functional._adjust_input_rank(self, flat_inputs)
    274             adjusted.append(ops.expand_dims(x, axis=-1))
    275             continue
--> 276     raise ValueError(
    277         f"Invalid input shape for input {x}. Expected shape "
    278         f"{ref_shape}, but input has incompatible shape {x.shape}"
    279     )
    280 # Add back metadata.
    281 for i in range(len(flat_inputs)):

ValueError: Exception encountered when calling Sequential.call().

Invalid input shape for input Tensor("data:0", shape=(32, 224, 3), dtype=float32). Expected shape (None, 224, 224, 3), but input has incompatible shape (32, 224, 3)

Arguments received by Sequential.call():
  • inputs=tf.Tensor(shape=(32, 224, 3), dtype=float32)
  • training=False
  • mask=None
  • kwargs=<class 'inspect._empty'>

Something's wrong... The image has the (224, 224, 3) shape the model was trained on, yet the model complains a dimension is missing.

The missing dimension is the batch size: the model expects (batch_size, 224, 224, 3).

We can fix this by adding the extra dimension with tf.expand_dims.

In [112]:
# add an extra axis
print(f'Shape before new dimension: {steak.shape}')
steak = tf.expand_dims(steak, axis=0) # add an extra dimension at axis 0
# steak = steak[tf.newaxis, ...] is the alternative for the above
print(f'Shape after new dimension: {steak.shape}')
steak
Shape before new dimension: (224, 224, 3)
Shape after new dimension: (1, 224, 224, 3)
Out[112]:
<tf.Tensor: shape=(1, 224, 224, 3), dtype=float32, numpy=
array([[[[0.6377451 , 0.6220588 , 0.57892156],
         [0.6504902 , 0.63186276, 0.5897059 ],
         [0.63186276, 0.60833335, 0.5612745 ],
         ...,
         [0.52156866, 0.05098039, 0.09019608],
         [0.49509802, 0.04215686, 0.07058824],
         [0.52843136, 0.07745098, 0.10490196]],

        [[0.6617647 , 0.6460784 , 0.6107843 ],
         [0.6387255 , 0.6230392 , 0.57598037],
         [0.65588236, 0.63235295, 0.5852941 ],
         ...,
         [0.5352941 , 0.06862745, 0.09215686],
         [0.529902  , 0.05931373, 0.09460784],
         [0.5142157 , 0.05539216, 0.08676471]],

        [[0.6519608 , 0.6362745 , 0.5892157 ],
         [0.6392157 , 0.6137255 , 0.56764704],
         [0.65637255, 0.6269608 , 0.5828431 ],
         ...,
         [0.53137255, 0.06470589, 0.08039216],
         [0.527451  , 0.06862745, 0.1       ],
         [0.52254903, 0.05196078, 0.0872549 ]],

        ...,

        [[0.49313724, 0.42745098, 0.31029412],
         [0.05441177, 0.01911765, 0.        ],
         [0.2127451 , 0.16176471, 0.09509804],
         ...,
         [0.6132353 , 0.59362745, 0.57009804],
         [0.65294117, 0.6333333 , 0.6098039 ],
         [0.64166665, 0.62990195, 0.59460783]],

        [[0.65392154, 0.5715686 , 0.45      ],
         [0.6367647 , 0.54656863, 0.425     ],
         [0.04656863, 0.01372549, 0.        ],
         ...,
         [0.6372549 , 0.61764705, 0.59411764],
         [0.63529414, 0.6215686 , 0.5892157 ],
         [0.6401961 , 0.62058824, 0.59705883]],

        [[0.1       , 0.05539216, 0.        ],
         [0.48333332, 0.40882352, 0.29117647],
         [0.65      , 0.5686275 , 0.44019607],
         ...,
         [0.6308824 , 0.6161765 , 0.5808824 ],
         [0.6519608 , 0.63186276, 0.5901961 ],
         [0.6338235 , 0.6259804 , 0.57892156]]]], dtype=float32)>
In [113]:
# make a prediction on custom image
pred = model_8.predict(steak)
pred
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 175ms/step
Out[113]:
array([[0.8684933]], dtype=float32)

Our prediction comes out as a probability. Since this is binary classification, 0.5 is the decision boundary between pizza and steak. So with a value of ~0.87, the prediction is most likely the positive class of 1.

But a positive or negative class doesn't give much of a clue as to whether the image is pizza or steak. So we should write a function to convert the prediction probability to a class name.

In [114]:
# we can index the predicted class by rounding the probability
pred_class = class_names[int(tf.round(pred)[0][0])] # the [0][0] selects the value from the 2D tensor, after rounding > [[1.0]] is what the number actually looks like when rounded
pred_class
Out[114]:
np.str_('steak')
In [118]:
def pred_and_plot(model, filename, class_names):
    """
    Imports an image located at filename, makes prediction with model, and plots image with predicted class as title
    """
    # import the target image/preprocess it
    img = load_and_prep_image(filename)

    # make a prediction
    pred = model.predict(tf.expand_dims(img, axis=0))

    # get the predicted value
    pred_class = class_names[int(tf.round(pred)[0][0])]

    # plot the image and predicted label
    plt.imshow(img)
    plt.title(f"Prediction: {pred_class}")
    plt.axis(False);
In [119]:
# test our model with custom image
pred_and_plot(model_8, '03-steak.jpg', class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 83ms/step
No description has been provided for this image

Nice, the model got it correct!

In [123]:
# download another image to make a prediction
pred_and_plot(model_8, '03-pizza-dad.jpeg', class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step
No description has been provided for this image

Multi-class classification¶

We've referenced the TinyVGG architecture from the CNN explainer website. But the CNN explainer works with 10 categories, rather than binary classification like ours.

Let's go through the same steps again, but this time, we'll work with 10 different categories of food.

(image: machine learning workflow)

The workflow we're following is a slightly modified version of the above. As you do more deep learning, this workflow becomes more of an outline than a step-by-step guide.

1. Import and become one with the data¶

Going back to the Food101 dataset again: in addition to steak and pizza, we'll pull out 8 other categories for our next challenge.

In [124]:
import zipfile
import urllib.request

# Step 1: Download the zip file 10_food_classes
url = "https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_all_data.zip"
urllib.request.urlretrieve(url, "10_food_classes_all_data.zip")  # saves the file locally

# Step 2: Unzip the file
with zipfile.ZipFile("10_food_classes_all_data.zip", "r") as zip_ref:
    zip_ref.extractall()  # extract all files to the current directory

Now let's check out all the different directories/sub-directories in the 10_food_classes_all_data directory.

In [126]:
import os

# walk through 10_food_classes directory and list number of files
for dirpath, dirnames, filenames in os.walk('10_food_classes_all_data'):
    print(f'There are {len(dirnames)} directories and {len(filenames)} images in "{dirpath}".')
There are 2 directories and 0 images in "10_food_classes_all_data".
There are 10 directories and 0 images in "10_food_classes_all_data\test".
There are 0 directories and 250 images in "10_food_classes_all_data\test\ice_cream".
There are 0 directories and 250 images in "10_food_classes_all_data\test\chicken_curry".
There are 0 directories and 250 images in "10_food_classes_all_data\test\steak".
There are 0 directories and 250 images in "10_food_classes_all_data\test\sushi".
There are 0 directories and 250 images in "10_food_classes_all_data\test\chicken_wings".
There are 0 directories and 250 images in "10_food_classes_all_data\test\grilled_salmon".
There are 0 directories and 250 images in "10_food_classes_all_data\test\hamburger".
There are 0 directories and 250 images in "10_food_classes_all_data\test\pizza".
There are 0 directories and 250 images in "10_food_classes_all_data\test\ramen".
There are 0 directories and 250 images in "10_food_classes_all_data\test\fried_rice".
There are 10 directories and 0 images in "10_food_classes_all_data\train".
There are 0 directories and 750 images in "10_food_classes_all_data\train\ice_cream".
There are 0 directories and 750 images in "10_food_classes_all_data\train\chicken_curry".
There are 0 directories and 750 images in "10_food_classes_all_data\train\steak".
There are 0 directories and 750 images in "10_food_classes_all_data\train\sushi".
There are 0 directories and 750 images in "10_food_classes_all_data\train\chicken_wings".
There are 0 directories and 750 images in "10_food_classes_all_data\train\grilled_salmon".
There are 0 directories and 750 images in "10_food_classes_all_data\train\hamburger".
There are 0 directories and 750 images in "10_food_classes_all_data\train\pizza".
There are 0 directories and 750 images in "10_food_classes_all_data\train\ramen".
There are 0 directories and 750 images in "10_food_classes_all_data\train\fried_rice".

Looks good! Now let's set up the train and test directory paths.

In [127]:
train_dir = '10_food_classes_all_data/train/'
test_dir = '10_food_classes_all_data/test/'

Get the class names from the subdirectories.

In [128]:
# get the class names for our multi-class dataset
import pathlib
import numpy as np
data_dir = pathlib.Path(train_dir)
class_names = np.array(sorted([item.name for item in data_dir.glob('*')])) # * > matching everything inside the directory
print(class_names)
['chicken_curry' 'chicken_wings' 'fried_rice' 'grilled_salmon' 'hamburger'
 'ice_cream' 'pizza' 'ramen' 'steak' 'sushi']
In [136]:
# view a random image in the file
import random
img = view_random_image(target_dir=train_dir, target_class=random.choice(class_names)) # gets a random class name
Image shape: (512, 382, 3)
No description has been provided for this image

2. Preprocess the data (prepare it for a model)¶

After going through some images (10-100), it looks like everything is set up correctly.

Time to preprocess the data.

In [139]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# rescale the data and create data generator instances
train_datagen = ImageDataGenerator(rescale=1/255.)
test_datagen = ImageDataGenerator(rescale=1/255.)

# load data into directory, and turn them into batches
train_data = train_datagen.flow_from_directory(train_dir,
                                               target_size=(224,224),
                                               batch_size=32,
                                               class_mode='categorical')

test_data = test_datagen.flow_from_directory(test_dir,
                                             target_size=(224,224),
                                             batch_size=32,
                                             class_mode='categorical')
Found 7500 images belonging to 10 classes.
Found 2500 images belonging to 10 classes.

The main change from our binary classifier to 10 categories is changing class_mode from 'binary' to 'categorical'. Everything else stays the same.

Question: Why do we resize our images to 224x224? It's a very common default size for preprocessing images, but depending on your problem, a bigger or smaller image size may work better.
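To get a feel for the tradeoff, here's a quick sketch (the numbers are illustrative, assuming float32 tensors) of how image size affects tensor memory, using tf.image.resize on a dummy image:

```python
import tensorflow as tf

# a dummy 512x512 RGB image with values in [0, 1]
img = tf.random.uniform((512, 512, 3))

# resize down to the common 224x224 default
img_224 = tf.image.resize(img, size=(224, 224))
print(img_224.shape)  # (224, 224, 3)

# rough memory per float32 image at each size (H * W * C * 4 bytes)
print(f"512x512: {512*512*3*4/1e6:.1f} MB")  # ~3.1 MB
print(f"224x224: {224*224*3*4/1e6:.1f} MB")  # ~0.6 MB
```

Bigger images give the model more detail to work with, but every batch, activation and gradient grows along with them.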

3. Create a model (start with a baseline)¶

We can use the same model (TinyVGG) we've used for binary classification problem, for our multi-class classification problem with a couple of small tweaks.

Namely:

  • Changing the output to have 10 output neurons (the same number of categories).
  • Changing the output layer to use softmax over sigmoid.
  • Changing the loss function to be categorical_crossentropy over binary_crossentropy.
In [145]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense

# create our model (a TinyVGG clone like model_8, but for multi-class classification)
model_9 = Sequential([
    Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Conv2D(10,3,activation='relu'),
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Flatten(),
    Dense(10,activation='softmax')
])

# compile the model
model_9.compile(loss='categorical_crossentropy',
                optimizer='Adam',
                metrics=['accuracy'])

4. Fit a model¶

Now we've got a model fit for multiple classes. Let's fit it.

In [146]:
# fit the model
history_9 = model_9.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=test_data,
                        validation_steps=len(test_data))
Epoch 1/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 84s 351ms/step - accuracy: 0.1707 - loss: 2.2480 - val_accuracy: 0.2756 - val_loss: 1.9998
Epoch 2/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 48s 205ms/step - accuracy: 0.3320 - loss: 1.9258 - val_accuracy: 0.2648 - val_loss: 2.0030
Epoch 3/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 44s 186ms/step - accuracy: 0.4773 - loss: 1.5691 - val_accuracy: 0.2796 - val_loss: 2.0894
Epoch 4/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 43s 182ms/step - accuracy: 0.7336 - loss: 0.8674 - val_accuracy: 0.2568 - val_loss: 2.6161
Epoch 5/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 43s 181ms/step - accuracy: 0.9222 - loss: 0.3081 - val_accuracy: 0.2904 - val_loss: 3.7356

It takes noticeably longer to train this model, despite the same number of epochs as our binary model. This is due to the amount of images the model has to process: each class has 750 training and 250 test images, and with 10 classes instead of 2, that's 7,500 training images rather than 1,500. Our model above is dealing with 5 times as much data as the binary one.
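The epoch logs above show 235 training batches, which follows directly from the dataset size and the batch size of 32. A quick check:

```python
import math

train_images = 7500  # 750 images x 10 classes
test_images = 2500   # 250 images x 10 classes
batch_size = 32

# len(train_data) is the number of batches: ceil(images / batch_size)
print(math.ceil(train_images / batch_size))  # 235
print(math.ceil(test_images / batch_size))   # 79
```

The 79 matches the number of batches we see when evaluating on the test data.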

5. Evaluate the model¶

Yay, we've trained the model :) Let's see the results visually.

In [147]:
# evaluate on test data
model_9.evaluate(test_data)
79/79 ━━━━━━━━━━━━━━━━━━━━ 8s 95ms/step - accuracy: 0.3031 - loss: 3.6326
Out[147]:
[3.735577344894409, 0.2903999984264374]
In [148]:
# check model curves
plot_loss_curves(history_9)
No description has been provided for this image
No description has been provided for this image

Hmm, pretty poor results on our training and validation loss curves. So what does this say?

It looks like the model has overfit the training data, and is generalizing poorly to data it hasn't seen before.

6. Adjust the model parameters¶

It's clear the model is learning something, but not in the direction we want. Ideally, validation performance should track training performance. So our next step is to try and prevent overfitting from occurring. Some options:

  • Get more data - The simplest but often hardest option. More data gives the model more opportunities to learn patterns.
  • Simplify the model - If a model overfits, it may be too complicated: it's learning the training patterns too well, or learning noise that isn't really a pattern, which makes it hard to generalize to unseen data. We can reduce the number of layers or hidden units.
  • Use data augmentation - Manipulating the training images slightly, which makes learning harder and adds more variety to the data. If the model can learn from augmented data, it may generalize better to unseen data.
  • Use transfer learning - Leverage a model that has already learned patterns on similar data as the foundation for our own task, such as taking an existing computer vision model and tweaking it slightly to suit our food data.

For now, let's simplify the model first. We'll remove two convolutional layers, so we go from four to two.

In [149]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense

# create our model (a simplified version of model_9: two conv layers instead of four)
model_10 = Sequential([
    Conv2D(10,3,activation='relu',input_shape=(224,224,3)),
    MaxPool2D(),
    Conv2D(10,3,activation='relu'),
    MaxPool2D(),
    Flatten(),
    Dense(10,activation='softmax')
])

# compile the model
model_10.compile(loss='categorical_crossentropy',
                optimizer='Adam',
                metrics=['accuracy'])

# fit the model
history_10 = model_10.fit(train_data,
                         epochs=5,
                         steps_per_epoch=len(train_data),
                         validation_data=test_data,
                         validation_steps=len(test_data))
x:\Anaconda3\envs\tf_gpu\lib\site-packages\keras\src\layers\convolutional\base_conv.py:113: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Epoch 1/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 39s 160ms/step - accuracy: 0.1917 - loss: 2.1982 - val_accuracy: 0.2984 - val_loss: 1.9639
Epoch 2/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 35s 150ms/step - accuracy: 0.4483 - loss: 1.6929 - val_accuracy: 0.3288 - val_loss: 1.9280
Epoch 3/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 36s 152ms/step - accuracy: 0.6271 - loss: 1.2168 - val_accuracy: 0.2868 - val_loss: 2.1241
Epoch 4/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 37s 158ms/step - accuracy: 0.7917 - loss: 0.7238 - val_accuracy: 0.2924 - val_loss: 2.4491
Epoch 5/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 37s 157ms/step - accuracy: 0.9233 - loss: 0.3336 - val_accuracy: 0.2752 - val_loss: 3.0246
In [150]:
# check out loss curves of model_10
plot_loss_curves(history_10)
No description has been provided for this image
No description has been provided for this image

Well, it seems our simplified model didn't help. Let's try data augmentation.

To do this, we need another ImageDataGenerator instance. This time we'll add parameters such as rotation_range and horizontal_flip to manipulate our images.

In [152]:
# create augmented data generator instance
train_datagen_augmented = ImageDataGenerator(rescale=1/255.,
                                          rotation_range=20,
                                          width_shift_range=0.2,
                                          height_shift_range=0.2,
                                          zoom_range=0.2,
                                          horizontal_flip=True)

train_data_augmented = train_datagen_augmented.flow_from_directory(train_dir,
                                                                   target_size=(224,224),
                                                                   batch_size=32,
                                                                   class_mode='categorical')
Found 7500 images belonging to 10 classes.

After setting up augmentation, we can try again with the model_10 architecture. We don't have to rewrite the model; we can use the handy clone_model function, which takes an existing model and rebuilds it with the same architecture.

The cloned model doesn't carry over the weights the original model learned, so training starts from a clean slate.

Note: A key practice in deep learning is to be a serial experimenter: try something, see if it works, then try something else. A good experimenter also keeps track of what changed at each step and what results came of it. For our example, that's augmenting the data and trying it with our previous architecture, to see whether anything changes in the loss curves.

In [154]:
# clone the model
model_11 = tf.keras.models.clone_model(model_10)

# compile model
model_11.compile(loss='categorical_crossentropy',
                 optimizer='Adam',
                 metrics=['accuracy'])

# fit the model
history_11 = model_11.fit(train_data_augmented,
             epochs=5,
             steps_per_epoch=len(train_data_augmented),
             validation_data=test_data,
             validation_steps=len(test_data))
Epoch 1/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 104s 439ms/step - accuracy: 0.1404 - loss: 2.4096 - val_accuracy: 0.2552 - val_loss: 2.0999
Epoch 2/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 104s 444ms/step - accuracy: 0.2399 - loss: 2.1277 - val_accuracy: 0.2788 - val_loss: 2.0456
Epoch 3/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 101s 429ms/step - accuracy: 0.2750 - loss: 2.0650 - val_accuracy: 0.3332 - val_loss: 1.9077
Epoch 4/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 101s 429ms/step - accuracy: 0.2827 - loss: 2.0143 - val_accuracy: 0.3288 - val_loss: 1.9321
Epoch 5/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 102s 435ms/step - accuracy: 0.3084 - loss: 1.9931 - val_accuracy: 0.3700 - val_loss: 1.8488

You can see how much longer each epoch takes; the images are augmented on the fly during training, and this happens on the CPU rather than the GPU.

Note: One way to improve the time taken is to use augmentation layers such as tf.keras.layers.RandomFlip, which flip images as part of the model. Data loading can also be sped up with tf.keras.utils.image_dataset_from_directory, an image loading API. (Will be covered later on.)
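As a sketch of that approach (assuming a TF 2.x version where these layers live under tf.keras.layers), augmentation can be built into the model as layers, so it runs on the accelerator as part of the forward pass instead of on the CPU:

```python
import tensorflow as tf
from tensorflow.keras import layers

# augmentation as model layers instead of ImageDataGenerator
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),  # up to ~10% of a full turn either way
    layers.RandomZoom(0.2),
])

# augmentation only kicks in when training=True; at inference it's a no-op
images = tf.random.uniform((32, 224, 224, 3))
augmented = data_augmentation(images, training=True)
print(augmented.shape)  # (32, 224, 224, 3)
```

These layers can be placed at the front of a Sequential model, right before the first Conv2D layer.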

So how's the model curves?

In [155]:
# check out model performance of model_11
plot_loss_curves(history_11)
No description has been provided for this image
No description has been provided for this image

Performance definitely looks better! The loss curves are healthier, and although training accuracy on the augmented data is quite low, the model performed much better on the validation dataset this time. It looks as if the model could keep improving if trained beyond our 5 epochs.

7. Repeat until satisfied¶

We can continue experimenting: restructuring the model architecture, adding more layers, adjusting the learning rate, trying different augmentation methods, etc. As you can guess, this can take a very long time.

Good thing there's the trick of transfer learning.

We'll save that for the next notebook. But in the meantime, let's make a prediction with our trained multi-class model.

Making a prediction with our trained model¶

What good is a model, if you can't make predictions with it?

Let's remind ourselves of the 10 food categories we're dealing with, before trying some custom images.

In [156]:
# what are our class names?

class_names
Out[156]:
array(['chicken_curry', 'chicken_wings', 'fried_rice', 'grilled_salmon',
       'hamburger', 'ice_cream', 'pizza', 'ramen', 'steak', 'sushi'],
      dtype='<U14')

Let's get some custom images.

Now we'll use the pred_and_plot function to make a prediction with model_11 on an image and see its output.

In [157]:
# make prediction on model_11 with custom image
pred_and_plot(model=model_11,
              filename='03-steak.jpeg',
              class_names=class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step
No description has been provided for this image

Not correct, but let's try another image

In [159]:
pred_and_plot(model=model_11,
              filename='03-sushi.jpeg',
              class_names=class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 86ms/step
No description has been provided for this image

still chicken curry :(

In [160]:
pred_and_plot(model_11, '03-pizza-dad.jpeg', class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step
No description has been provided for this image

lmao chicken curry

Maybe there's something wrong with the pred_and_plot function. Let's make a prediction outside of that function instead.

In [165]:
# load in and preprocess our custom image
img = load_and_prep_image('03-steak.jpeg')

# make prediction
pred = model_11.predict(tf.expand_dims(img,axis=0))

# match the prediction class to the highest prediction probability
pred_class=class_names[pred.argmax()]
plt.imshow(img)
plt.title(pred_class)
plt.axis(False);
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 65ms/step
No description has been provided for this image

The prediction is still wrong, but at least now it's different.

The issue with our previous function is most likely that it was written for binary classification and can't handle multi-class outputs. The main problem lies in how it converts the prediction to a class.

In [167]:
# check the output of the predict function
pred = model_11.predict(tf.expand_dims(img,axis=0))
pred
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step
Out[167]:
array([[0.03498406, 0.112686  , 0.070467  , 0.21361639, 0.08432788,
        0.15790951, 0.02166959, 0.07102852, 0.19622356, 0.03708745]],
      dtype=float32)

Our model has a softmax activation with 10 output neurons, where each neuron outputs a prediction probability for one class.

We can use argmax to find the index of the highest probability, then look it up in class_names to see which food item the model deemed most likely.

In [168]:
# find the predicted class name
class_names[pred.argmax()]
Out[168]:
np.str_('grilled_salmon')

Knowing that, we can readjust pred_and_plot function to work with multiple classes as well as binary classes.

In [169]:
# adjust function to work with multi-class
def pred_and_plot(model, filename, class_names):
    '''
    Imports an image located at filename, makes a prediction on it with
    a trained model and plots the image with the predicted class as the title.
    '''
    # import the target image and preprocess it
    img = load_and_prep_image(filename)

    # make a prediction
    pred = model.predict(tf.expand_dims(img,axis=0))

    # get the predicted class
    if len(pred[0]) > 1: # check for multiclass
        pred_class = class_names[pred.argmax()] # if more than one output, take the max value
    else:
        pred_class = class_names[int(tf.round(pred)[0][0])] # if only one output, round the probability to 0 or 1

    # plot the image and predicted class
    plt.imshow(img)
    plt.title(f'Prediction: {pred_class}')
    plt.axis(False);

Let's try it out, now that it shouldn't predict chicken curry for everything.

In [170]:
pred_and_plot(model_11, '03-steak.jpeg', class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 111ms/step
No description has been provided for this image
In [171]:
pred_and_plot(model_11, "03-sushi.jpeg", class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step
No description has been provided for this image
In [172]:
pred_and_plot(model_11, "03-pizza-dad.jpeg", class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step
No description has been provided for this image
In [173]:
pred_and_plot(model_11, "03-hamburger.jpeg", class_names)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step
No description has been provided for this image

Unfortunately the predictions aren't very accurate; the model only reaches around 37% accuracy on the validation data.

We will improve on this later on through transfer learning.

Saving and loading our model¶

Once a model is trained, you'll probably want to save it and load it elsewhere.

To do so, we can use the save and load_model functions.

In [175]:
# save a model
model_11.save('saved_trained_model.keras')
In [176]:
# load in a model and evaluate it
loaded_model_11 = tf.keras.models.load_model('saved_trained_model.keras')
loaded_model_11.evaluate(test_data)
79/79 ━━━━━━━━━━━━━━━━━━━━ 27s 326ms/step - accuracy: 0.3824 - loss: 1.8630
Out[176]:
[1.848818302154541, 0.3700000047683716]
In [177]:
# compare saved model to unsaved model
model_11.evaluate(test_data)
79/79 ━━━━━━━━━━━━━━━━━━━━ 6s 76ms/step - accuracy: 0.3665 - loss: 1.8623
Out[177]:
[1.8488181829452515, 0.3700000047683716]

Exercises¶

  1. Check out the CNN explainer website, note the key terms, and explain convolution, pooling, etc. in your own words.

What is CNN?¶

It's a type of neural network suited to image data. The input is a tensor (an n-dimensional array), which passes through layers of neurons; each neuron combines its inputs into a single output. The convolutional layers learn filters that are applied over the image to enhance or pick out important features, and every neuron has weights and a bias that get tweaked when the model's predictions are off.

How does a convolutional layer work?¶

It starts with the kernel, a small grid of weights (say 3x3) that helps the network discern important details of an image. The kernel multiplies each pixel in the grid by its corresponding weight, and all the products are summed into a single value (a dot product), which becomes one pixel of the output. The kernel then slides to the next position; how far it moves each time is called the stride. With a stride of 1 it moves one pixel at a time; with a stride of 2, it moves two pixels.

Because images have RGB channels (making 3 separate channels), the kernel outputs for the 3 channels are added together into a final value, plus a learned bias the model thinks will help prediction accuracy.

Each of these filters is there to detect a specific feature, such as edges, eyes, curves, etc.
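To make the sliding-kernel idea concrete, here's a minimal NumPy sketch (toy values, single channel, no padding) of a 3x3 kernel convolved over a 4x4 image with a stride of 1:

```python
import numpy as np

# a 4x4 single-channel "image": bright on the left, dark on the right
image = np.array([[1, 1, 1, 0],
                  [1, 1, 1, 0],
                  [1, 1, 1, 0],
                  [1, 1, 1, 0]], dtype=float)

# a 3x3 kernel that responds to vertical edges
kernel = np.array([[ 1, 0, -1],
                   [ 1, 0, -1],
                   [ 1, 0, -1]], dtype=float)

stride = 1
out_size = (image.shape[0] - kernel.shape[0]) // stride + 1  # 2
output = np.zeros((out_size, out_size))

# slide the kernel over the image: each output pixel is the
# element-wise product of the kernel and the patch, summed up
for i in range(out_size):
    for j in range(out_size):
        patch = image[i*stride:i*stride+3, j*stride:j*stride+3]
        output[i, j] = np.sum(patch * kernel)

print(output)  # [[0. 3.] [0. 3.]] - the vertical edge lights up
```

The right-hand column of the output is large exactly where the bright-to-dark transition sits, which is what "detecting a feature" means in practice.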

ReLU and Softmax¶

ReLU is widely used for introducing non-linearity, which makes it great for complex models, because not every relationship is a straight line. It works by keeping all positive values the same and returning 0 for all negative values.

Softmax is typically used on multi-class classification problems. When there are more than 2 output classes, the raw outputs don't add up neatly to 1. Softmax normalizes the values so that all classes sum to 1, aka 100%, and can be read as probabilities.
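Both functions are simple enough to sketch by hand in NumPy (the logits here are toy values, not from our model):

```python
import numpy as np

def relu(x):
    # keep positives as-is, zero out negatives
    return np.maximum(0, x)

def softmax(x):
    # subtract the max for numerical stability, then normalize
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([-2.0, 1.0, 3.0])
print(relu(logits))  # [0. 1. 3.]

probs = softmax(logits)
print(probs.sum())   # 1.0 - softmax outputs always sum to 1
print(probs.argmax())  # 2 - the largest logit wins
```

This is why argmax over the softmax output gives us the predicted class, as we did with pred.argmax() above.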

Max Pooling¶

It operates by shrinking the feature map while keeping the most important information within it. Typically using a 2x2 window, it takes the largest value in each window, then slides the window along by its stride (one pixel, two pixels, etc.).
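A minimal NumPy sketch of 2x2 max pooling with a stride of 2 over a toy 4x4 feature map:

```python
import numpy as np

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [7, 2, 9, 1],
                        [3, 4, 1, 8]])

# split into non-overlapping 2x2 blocks, then take the max of each block
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 4] [7 9]]
```

The 4x4 map shrinks to 2x2, but each output cell still carries the strongest activation from its region, which is exactly what MaxPool2D() does in our models above.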

In [1]:
pip install nbconvert PyPDF2
Requirement already satisfied: nbconvert in x:\anaconda3\lib\site-packages (7.10.0)
Requirement already satisfied: PyPDF2 in x:\anaconda3\lib\site-packages (3.0.1)
Requirement already satisfied: beautifulsoup4 in x:\anaconda3\lib\site-packages (from nbconvert) (4.12.2)
Requirement already satisfied: bleach!=5.0.0 in x:\anaconda3\lib\site-packages (from nbconvert) (4.1.0)
Requirement already satisfied: defusedxml in x:\anaconda3\lib\site-packages (from nbconvert) (0.7.1)
Requirement already satisfied: jinja2>=3.0 in x:\anaconda3\lib\site-packages (from nbconvert) (3.1.3)
Requirement already satisfied: jupyter-core>=4.7 in x:\anaconda3\lib\site-packages (from nbconvert) (5.5.0)
Requirement already satisfied: jupyterlab-pygments in x:\anaconda3\lib\site-packages (from nbconvert) (0.1.2)
Requirement already satisfied: markupsafe>=2.0 in x:\anaconda3\lib\site-packages (from nbconvert) (2.1.3)
Requirement already satisfied: mistune<4,>=2.0.3 in x:\anaconda3\lib\site-packages (from nbconvert) (2.0.4)
Requirement already satisfied: nbclient>=0.5.0 in x:\anaconda3\lib\site-packages (from nbconvert) (0.8.0)
Requirement already satisfied: nbformat>=5.7 in x:\anaconda3\lib\site-packages (from nbconvert) (5.9.2)
Requirement already satisfied: packaging in x:\anaconda3\lib\site-packages (from nbconvert) (23.1)
Requirement already satisfied: pandocfilters>=1.4.1 in x:\anaconda3\lib\site-packages (from nbconvert) (1.5.0)
Requirement already satisfied: pygments>=2.4.1 in x:\anaconda3\lib\site-packages (from nbconvert) (2.15.1)
Requirement already satisfied: tinycss2 in x:\anaconda3\lib\site-packages (from nbconvert) (1.2.1)
Requirement already satisfied: traitlets>=5.1 in x:\anaconda3\lib\site-packages (from nbconvert) (5.7.1)
Requirement already satisfied: six>=1.9.0 in x:\anaconda3\lib\site-packages (from bleach!=5.0.0->nbconvert) (1.16.0)
Requirement already satisfied: webencodings in x:\anaconda3\lib\site-packages (from bleach!=5.0.0->nbconvert) (0.5.1)
Requirement already satisfied: platformdirs>=2.5 in x:\anaconda3\lib\site-packages (from jupyter-core>=4.7->nbconvert) (3.10.0)
Requirement already satisfied: pywin32>=300 in x:\anaconda3\lib\site-packages (from jupyter-core>=4.7->nbconvert) (305.1)
Requirement already satisfied: jupyter-client>=6.1.12 in x:\anaconda3\lib\site-packages (from nbclient>=0.5.0->nbconvert) (7.4.9)
Requirement already satisfied: fastjsonschema in x:\anaconda3\lib\site-packages (from nbformat>=5.7->nbconvert) (2.16.2)
Requirement already satisfied: jsonschema>=2.6 in x:\anaconda3\lib\site-packages (from nbformat>=5.7->nbconvert) (4.19.2)
Requirement already satisfied: soupsieve>1.2 in x:\anaconda3\lib\site-packages (from beautifulsoup4->nbconvert) (2.5)
Requirement already satisfied: attrs>=22.2.0 in x:\anaconda3\lib\site-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert) (23.1.0)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in x:\anaconda3\lib\site-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert) (2023.7.1)
Requirement already satisfied: referencing>=0.28.4 in x:\anaconda3\lib\site-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert) (0.30.2)
Requirement already satisfied: rpds-py>=0.7.1 in x:\anaconda3\lib\site-packages (from jsonschema>=2.6->nbformat>=5.7->nbconvert) (0.10.6)
Requirement already satisfied: entrypoints in x:\anaconda3\lib\site-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (0.4)
Requirement already satisfied: nest-asyncio>=1.5.4 in x:\anaconda3\lib\site-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (1.6.0)
Requirement already satisfied: python-dateutil>=2.8.2 in x:\anaconda3\lib\site-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (2.8.2)
Requirement already satisfied: pyzmq>=23.0 in x:\anaconda3\lib\site-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (24.0.1)
Requirement already satisfied: tornado>=6.2 in x:\anaconda3\lib\site-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (6.3.3)
Note: you may need to restart the kernel to use updated packages.
In [ ]: